Abstract

A new segmented compressed sampling method for analog-to-information conversion (AIC) is proposed.
An analog signal measured by a number of parallel branches of mixers and integrators (BMIs), each
characterized by a specific random sampling waveform, is first segmented in time into M segments. Then 
the sub-samples collected on different segments and different BMIs are reused so that a larger number 
of samples than the number of BMIs is collected. This technique is shown to be equivalent to extending 
the measurement matrix, which consists of the BMI sampling waveforms, by adding new rows without 
actually increasing the number of BMIs. We prove that the extended measurement matrix satisfies the 
restricted isometry property with overwhelming probability if the original measurement matrix of BMI 
sampling waveforms satisfies it. We also show that the signal recovery performance can be improved 
significantly if our segmented AIC is used for sampling instead of the conventional AIC. Simulation 
results verify the effectiveness of the proposed segmented compressed sampling method and the validity 
of our theoretical studies.

I. Introduction

According to Shannon's sampling theorem, an analog band-limited signal can be recovered from its 
discrete-time samples if the sampling rate is at least twice the maximum frequency present in the signal. 
Recent theory of compressed sampling (CS), however, suggests that a signal can be recovered from
fewer samples if it is sparse or compressible [1]-[?]. CS theory also suggests that a universal sampling
matrix (for example, a random projection matrix) can be designed, and it can be used for all sparse
signals regardless of their nature [1]. CS has already found a wide range of applications such as image
acquisition [?], sensor networks [?], cognitive radios [?], communication channel estimation [?], [?], etc.

The sampling process often used in the CS literature consists of two steps. First, an analog signal is
sampled at the Nyquist rate and then a measurement matrix is applied to the time domain samples in
order to collect the compressed samples (see, for example, [?]). This sampling approach, however, defeats
one of the primary purposes of CS, which is avoiding high rate sampling. A more practical approach
for "direct" sampling and compression of analog signals has been presented in [10]. The analog signal
is assumed to belong to the class of signals in shift-invariant spaces, that is, the analog signal can be
represented as a linear combination of a set of $m$ basis functions defined over a period $T$. The analog
signal is first passed through a filter bank where each filter is matched to one of the $m$ basis functions and
the output is sampled at time instances $nT$ where $n$ is an integer. If the signal is sparse, then only $S < m$
samples are nonzero. The set of $m$ output samples is then passed through a measurement matrix to create
$K > S$ compressed samples representing the analog signal in a specific period $[(n-1)T, nT]$. It is worth
mentioning that this method is a generalization of another method in [11] which is devised for sub-Nyquist
sampling of multi-band signals. The limits of this method come from the underlying assumption that the
signal belongs to the class of signals in shift-invariant spaces. Although this assumption is argued to be
valid for a variety of engineering applications [10], [12] and can be generalized to the signals in a union
of subspaces [13], [14], it is still a limiting assumption. Moreover, the complexity of this method is by no
means lower than the complexity of another practical approach to CS, which avoids high rate sampling
[1], [15]. The name analog-to-information converter (AIC) has been coined for the latter method. The
AIC consists of several parallel branches of mixers and integrators (BMIs) in which the analog signal
is measured against different random sampling waveforms. Therefore, for every collected compressed
sample, there is a BMI that multiplies the signal by a sampling waveform and then integrates the result
over a period $T$.

In this paper, we propose a new segmented AIC structure with the goal of reducing the hardware
complexity.$^1$ The contributions of this work are the following. (i) A new segmented AIC structure is
developed. In this structure, the integration period $T$ is divided into $M$ equal subperiods such that the
sampling rate of our segmented AIC scheme is $M$ times higher than that of the AIC of [1]. The sub-samples
collected over different subperiods are then reused, by combining sub-samples from different BMIs,
in order to build additional samples. In this way, a number of samples larger than the number of BMIs
can be collected, although such samples will be correlated. We show that our segmented AIC technique
is equivalent to extending the measurement matrix, which consists of the BMI sampling waveforms, by
adding new rows without actually increasing the number of BMIs. In this respect, the following works
also need to be mentioned [17], [18]. In [17], Toeplitz-structured measurement matrices are considered,
while measurement matrices built on one random vector with shifts of $D > 1$ in between the rows
appear in the radar imaging application considered in [18]. (ii) We show that the restricted isometry property
(RIP), which is a sufficient condition for signal recovery based on compressed samples, is satisfied for the
extended measurement matrix resulting from the segmented AIC structure with overwhelming probability
if the original matrix of BMI sampling waveforms satisfies the RIP. Thus, our segmented AIC is a valid
candidate for CS. (iii) We also show that the signal recovery performance improves if our segmented
AIC is used for sampling instead of the AIC of [1] with the same number of BMIs. The mathematical
challenge in this part of the work is that the samples collected by our segmented AIC are correlated,
while all available results on performance analysis of the signal recovery are obtained for the case of
uncorrelated samples.

The rest of this paper is organized as follows. Necessary background on CS, CS signal recovery, and
AIC is briefly summarized in Section II. The main idea of the paper, that is, the segmented AIC structure,
is explained in Section III. We prove in Section IV that the extended measurement matrix resulting from
the proposed segmented AIC satisfies the RIP and, therefore, the segmented AIC is a legitimate CS
method. The signal recovery performance analysis for our segmented AIC is summarized in Section V.
Section VI demonstrates the simulation results and Section VII concludes the paper.

II. Background 

CS basics and notations: CS deals with a low rate representation of sparse signals, i.e., such signals
which have few nonzero projections on the vectors of an orthogonal basis (sparsity basis). Let
$\Psi = (\psi_1, \psi_2, \dots, \psi_N)^T$ be an $N \times N$ matrix of basis vectors $\psi_i$, $i = 1, \dots, N$, i.e., the sparsity basis, and
$f$ be a discrete-time sparse signal$^2$ represented in this basis as

$$ f = \sum_{i=1}^{N} x_i \psi_i = \Psi^T x \tag{1} $$

where $x = (x_1, x_2, \dots, x_N)^T$ is the $N \times 1$ vector of coefficients and $(\cdot)^T$ and $(\cdot)^H$ stand for the transpose
and Hermitian transpose, respectively. A signal is $S$-sparse if at most $S$ projections on the rows of $\Psi$,
i.e., coefficients of $x$, are nonzero. It is known that a universal compressed sampling method can be
designed to effectively sample and recover $S$-sparse signals regardless of the specific sparsity domain
[?], [?].

$^1$ Some preliminary results have been reported in [16].

Among various bounds on the sufficient number of collected compressed samples$^3$ $K$ ($S < K < N$)
required for recovering an $S$-sparse signal, the first and most popular one is given by the
inequality $S \le C K / \log(N/K)$ where $C$ is some constant [1]. This bound is derived based on the
uniform uncertainty principle [20]. Let $\Phi$ be a $K \times N$ measurement matrix applied to a sparse signal
for collecting $K$ compressed samples. Then the uniform uncertainty principle states that $\Phi$ must satisfy
the following restricted isometry property (RIP) [1]. Let $\Phi_T$ be a sub-matrix of $\Phi$ retaining only the
columns with their indexes in the set $T \subset \{1, \dots, N\}$. Then the $S$-restricted isometry constant $\delta_S$ is the
smallest number satisfying the inequality

$$ \frac{K}{N} (1 - \delta_S) \|c\|_{\ell_2}^2 \le \|\Phi_T c\|_{\ell_2}^2 \le \frac{K}{N} (1 + \delta_S) \|c\|_{\ell_2}^2 \tag{2} $$

for all sets $T$ of cardinality less than or equal to $S$ and all vectors $c$ (here $\|\cdot\|_{\ell_2}$ denotes the Euclidean
norm of a vector). As shown in [1], [21], if the entries of $\Phi$ are, for example, independent zero mean
Gaussian variables with variance $1/N$, then $\Phi$ satisfies the RIP for $S \le C K / \log(N/K)$ with high
probability.$^4$

Recovery methods: Using the measurement matrix $\Phi$, the $K \times 1$ vector of compressed samples $y$ can
be calculated as $y = \Phi f = \Phi' x$ where $\Phi' = \Phi \Psi^T$. A signal can be recovered from its noiseless
sample vector $y$ based on the following convex optimization problem that can be solved by a linear
program [?], [22]

$$ \min_{\tilde{x}} \|\tilde{x}\|_{\ell_1} \quad \text{subject to} \quad \Phi' \tilde{x} = y \tag{3} $$

where $\|\cdot\|_{\ell_1}$ denotes the $\ell_1$-norm of a vector.
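
For reference, the statement that problem (3) can be solved by a linear program can be made explicit for real-valued data. The block below is the standard reformulation, not something specific to this paper; the auxiliary vector $t$ is a symbol introduced here only for illustration.

```latex
\begin{aligned}
\min_{\tilde{x},\, t} \quad & \sum_{i=1}^{N} t_i \\
\text{subject to} \quad & -t_i \le \tilde{x}_i \le t_i, \quad i = 1, \dots, N, \\
& \Phi' \tilde{x} = y .
\end{aligned}
```

At an optimum each $t_i$ equals $|\tilde{x}_i|$, so the objective coincides with the $\ell_1$-norm in (3) while all constraints remain linear.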

$^2$ It can be in $\mathbb{R}^N$ or $\mathbb{C}^N$.
$^3$ See [19] for a broader review.
$^4$ Note that in order to ensure consistency throughout the paper, the variance of the elements in $\Phi$ is taken to be $1/N$ instead
of $1/K$ as, for example, in [21]. Thus, the multiplier $K/N$ is added in the left- and right-hand sides of (2).

April 27, 2010 DRAFT

If the compressed samples are noisy, the sampling process can be expressed as

$$ y = \Phi f + w \tag{4} $$

where $w$ is a zero mean noise vector with independent and identically distributed (i.i.d.) entries of
variance $\sigma^2$. Then the recovery problem is modified as [23]

$$ \min_{\tilde{x}} \|\tilde{x}\|_{\ell_1} \quad \text{subject to} \quad \|\Phi' \tilde{x} - y\|_{\ell_2} \le \gamma \tag{5} $$

where $\gamma$ is the bound on the square root of the noise energy.

Another technique for sparse signal recovery from noisy samples (see [4]) uses the empirical risk
minimization method that was first developed in statistical learning theory for approximating an unknown
function based on noisy measurements [24]. Note that the empirical risk minimization-based recovery
method is of particular interest since under some simplifications (see [4, p. 4041]) it reduces to another
well-known method, the least absolute shrinkage and selection operator (LASSO) [25]. Therefore, the risk
minimization-based method of [4] provides the generality which we need in this paper.

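In application to CS, the unknown function is the sparse signal and the noisy compressed samples are
the collected data. Let the entries of the measurement matrix $\Phi$ be selected with equal probability as
$\pm 1/\sqrt{N}$, and the energy of the signal $f$ be bounded so that $\|f\|_{\ell_2}^2 \le N B^2$. The risk $r(\hat{f})$ of a candidate
reconstruction $\hat{f}$ and its empirical risk $\hat{r}(\hat{f})$ are defined as follows [24]

$$ r(\hat{f}) = \frac{\|\hat{f} - f\|_{\ell_2}^2}{N} + \sigma^2, \qquad \hat{r}(\hat{f}) = \frac{1}{K} \sum_{j=1}^{K} (y_j - \phi_j \hat{f})^2. \tag{6} $$

Then the candidate reconstruction $\hat{f}_K$ obtained based on $K$ samples can be found as [4]

$$ \hat{f}_K = \arg\min_{\hat{f} \in \mathcal{F}(B)} \left\{ \hat{r}(\hat{f}) + \frac{c(\hat{f}) \log 2}{\epsilon K} \right\} \tag{7} $$

where $\mathcal{F}(B) = \{f : \|f\|_{\ell_2}^2 \le N B^2\}$, $c(\hat{f})$ is a nonnegative number assigned to a candidate signal $\hat{f}$, and
$\epsilon = 1/(50 (B + \sigma)^2)$. Moreover, $\hat{f}_K$ given by (7) satisfies the following inequality [4]

$$ E\left\{ \frac{\|\hat{f}_K - f\|_{\ell_2}^2}{N} \right\} \le C_1 \min_{\hat{f} \in \mathcal{F}(B)} \left\{ \frac{\|\hat{f} - f\|_{\ell_2}^2}{N} + \frac{c(\hat{f}) \log 2}{\epsilon K} \right\} \tag{8} $$

where $C_1 = [(27 - 4e)(B/\sigma)^2 + (50 - 4\sqrt{2}) B/\sigma + 26] / [(23 - 4e)(B/\sigma)^2 + (50 - 4\sqrt{2}) B/\sigma + 24]$,
$e = 2.7183\ldots$, and $E\{\cdot\}$ stands for the expectation operation.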

Let a compressible signal $f$ be defined as a signal for which $\|f^{(m)} - f\|_{\ell_2}^2 \le N C_A m^{-2\alpha}$ where $f^{(m)}$
is the best $m$-term approximation of $f$ which is obtained by retaining the $m$ most significant coefficients
of vector $x$ ($x$ being the representation of $f$ in the sparsity basis $\Psi$), and $C_A > 0$ and $\alpha > 0$ are
some constants. Let also $\mathcal{F}_c(B, \alpha, C_A) = \{f : \|f\|_{\ell_2}^2 \le N B^2, \|f^{(m)} - f\|_{\ell_2}^2 \le N C_A m^{-2\alpha}\}$ be the set of
compressible signals. Then based on the weight assignment $c(\hat{f}) = 2 \log(N) N_x$ (here $N_x$ is the actual
number of nonzero coefficients in $x$) the following inequality holds [4]

$$ \sup_{f \in \mathcal{F}_c(B, \alpha, C_A)} E\left\{ \frac{\|\hat{f}_K - f\|_{\ell_2}^2}{N} \right\} \le C_1 C_2 \left( \frac{K}{\log N} \right)^{-2\alpha/(2\alpha+1)} \tag{9} $$

where $C_2 = C_2(B, \sigma, C_A) > 0$ is a constant.

If signal $f$ is indeed sparse and belongs to $\mathcal{F}_s(B, S) = \{f : \|f\|_{\ell_2}^2 \le N B^2, \|f\|_{\ell_0} \le S\}$, then there
exists a constant $C_2' = C_2'(B, \sigma) > 0$ such that [4]

$$ \sup_{f \in \mathcal{F}_s(B, S)} E\left\{ \frac{\|\hat{f}_K - f\|_{\ell_2}^2}{N} \right\} \le C_1 C_2' \left( \frac{K}{S \log N} \right)^{-1}. \tag{10} $$

AIC: The random modulation preintegration (RMPI) structure is proposed for AIC in [1]. The RMPI
multiplies the signal and the sampling waveforms in the analog domain and then integrates the product
over the signal period to produce samples. It implies that the sampling device has a number of parallel
BMIs in order to process the analog signal in real-time. The RMPI structure is shown in Fig. 1 where
$f(t)$ is the analog signal being sampled, $\phi_i(t)$, $i = 1, \dots, K$ are the sampling waveforms (rows of the
measurement matrix $\Phi$), and $y_i$, $i = 1, \dots, K$ are the compressed samples.

Fig. 1. The structure of the AIC based on RMPI.
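
The mixer-and-integrator operation of the RMPI can be illustrated with a small discrete-time sketch. It assumes the analog signal and the waveforms are replaced by length-N sequences, so that integration becomes a sum. The function names, the sizes N = 16 and K = 4, and the test signal are hypothetical choices made only for this illustration.

```python
import random


def random_waveforms(K, N, seed=0):
    """K rows of length N with entries +-1/sqrt(N); a stand-in for the random sampling waveforms."""
    rng = random.Random(seed)
    scale = 1.0 / N ** 0.5
    return [[rng.choice((-scale, scale)) for _ in range(N)] for _ in range(K)]


def rmpi_samples(signal, waveforms):
    """Each branch multiplies the signal by its waveform (mixer) and sums over the period (integrator)."""
    samples = []
    for waveform in waveforms:
        acc = 0.0
        for f_n, phi_n in zip(signal, waveform):
            acc += f_n * phi_n
        samples.append(acc)
    return samples


N, K = 16, 4
f = [0.0] * N
f[3], f[10] = 1.0, -2.0   # a 2-sparse test signal (sparse in the identity basis)
Phi = random_waveforms(K, N)
y = rmpi_samples(f, Phi)   # one compressed sample per BMI
print(y)
```

Each entry of `y` is the inner product of the signal with one row of the measurement matrix, which is the discrete counterpart of one branch in Fig. 1.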



III. Segmented Compressed Sampling Method 

AIC removes the need for high speed sampling, but it may still be necessary in many practical 
applications to collect a larger number of compressed samples than the AIC hardware (the number 
of parallel BMIs) may allow. Indeed, a smaller number of samples may have a negative effect on the 
signal recovery accuracy which can be an issue in a number of applications. In order to collect a larger 
number of compressed samples using AIC, we need to increase the hardware complexity by adding more
BMIs. This makes the AIC device complex and expensive although its sampling rate is much lower
than that of an analog-to-digital converter (ADC). Therefore, it is desirable to reduce the number of parallel
BMIs in AIC without sacrificing the signal recovery accuracy. This can be achieved by adding to AIC the
capability of sampling at a higher rate, which is, however, significantly lower than the sampling rate
required by ADC. The higher rate can be obtained by splitting the integration period $T$ in every BMI of the
AIC in Fig. 1 into shorter subperiods. It is equivalent to generating a number of incomplete samples of a
signal. Note that since the original integration period is divided into a number of smaller subperiods, the 
samples collected over all parallel BMIs during one subperiod do not have complete information about 
the signal. Therefore, they are called incomplete samples. Hereafter, the complete samples obtained over 
the whole period T are referred to as just samples, while the incomplete samples are referred to as 
sub-samples. 

A. The Basic Idea and the Model 

The basic idea is to collect the sub-samples as described above and then reuse them in order to build 
additional samples. In this manner, a larger number of samples than the number of BMIs can be collected. 
It allows for a tradeoff between AIC and ADC since as in AIC the signal is measured at a low rate by 
correlating it to a number of sampling waveforms, while the integration period is split into shorter sub- 
intervals which is similar to the requirement of a higher sampling rate as in ADC. However, the required 
sampling rate in the proposed scheme is still significantly lower than that required by ADC. 

Let the integration period be split into $M$ sub-intervals, and let $y_k = (y_{k,1}, \dots, y_{k,M})^T$, $k = 1, \dots, K$
be the vectors of sub-samples collected against the sampling waveforms $\phi_k$, $k = 1, \dots, K$, where $K$ is
the original number of sampling waveforms, i.e., the number of BMIs. The sub-sample $y_{k,j}$ is given by

$$ y_{k,j} = \int_{(j-1)T/M}^{jT/M} x(t) \phi_k(t) \, dt. \tag{11} $$

Then the total number of sub-samples collected in all BMIs over all subperiods is $MK$. These sub-samples
can be gathered in the following $K \times M$ matrix

$$ Y = \begin{pmatrix} y_{1,1} & y_{1,2} & \cdots & y_{1,M} \\ y_{2,1} & y_{2,2} & \cdots & y_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ y_{K,1} & y_{K,2} & \cdots & y_{K,M} \end{pmatrix} \tag{12} $$

where the $k$-th row contains the sub-samples obtained by correlating the measured signal with the
waveform $\phi_k$ over $M$ subperiods each of length $T/M$.

The original $K$ samples, i.e., the samples collected at BMIs over the whole time period $T$, are

$$ y_k = \sum_{m=1}^{M} [Y]_{k,m}, \quad k = 1, \dots, K \tag{13} $$

where $[Y]_{k,m}$ denotes the $(k,m)$-th element of $Y$, that is, $[Y]_{k,m} = y_{k,m}$.

In order to construct additional samples to the samples obtained using (13), we consider columnwise
permuted versions of $Y$. The following definitions are then in order.

The permutation $\pi$ is a one-to-one mapping of the elements of a set $V$ to itself by simply changing
the order of the elements. Then $\pi(k)$ stands for the index of the $k$-th element in the permuted set.
For example, let $V$ consist of the elements of a $K \times 1$ vector $z$, and let the order of the elements in
$V$ be the same as in $z$. After applying the permutation function $\pi$ to $z$, the permuted vector is
$z^{\pi} = (z_{\pi(1)}, \dots, z_{\pi(k)}, \dots, z_{\pi(K)})^T$. If vector $z$ is itself the vector of indexes, i.e., $z = (1, \dots, K)^T$, then
obviously $z_{\pi(k)} = \pi(k)$.

The permuted versions of the sub-sample matrix $Y$ can be obtained by applying different permutations
to different columns of $Y$. Specifically, let $\mathcal{P}^{(i)} = \{\pi_1^{(i)}, \dots, \pi_j^{(i)}, \dots, \pi_M^{(i)}\}$ be the $i$-th set
of column permutations with $\pi_j^{(i)}$ being the permutation function applied to the $j$-th column of $Y$,
and let $I$ stand for the number of such permutation sets. Then according to the above notations, the
matrix resulting from applying the set of permutations $\mathcal{P}^{(i)}$ to the columns of $Y$ can be expressed as
$Y^{\mathcal{P}^{(i)}} = (y_1^{\pi_1^{(i)}}, \dots, y_j^{\pi_j^{(i)}}, \dots, y_M^{\pi_M^{(i)}})$ where $y_j$ is the $j$-th column of $Y$.

Permutation sets $\mathcal{P}^{(i)}$, $i = 1, \dots, I$ are chosen in such a way that all sub-samples in a specific row
of $Y^{\mathcal{P}^{(i)}}$ come from different rows of the original sub-sample matrix $Y$ as well as from different rows
of other permuted matrices $Y^{\mathcal{P}^{(1)}}, \dots, Y^{\mathcal{P}^{(i-1)}}$. For example, all sub-samples in a specific row of $Y^{\mathcal{P}^{(1)}}$
must come from different rows of the original matrix $Y$ only, while the sub-samples in a specific row
of $Y^{\mathcal{P}^{(2)}}$ come from different rows of $Y$ and $Y^{\mathcal{P}^{(1)}}$, and so on. This requirement is enforced to make sure
that any additional sample has the least possible correlation with the original samples of (13). Then the
additional $KI$ samples can be obtained based on the permuted matrices $Y^{\mathcal{P}^{(i)}}$, $i = 1, \dots, I$ as

$$ y_{iK+k} = \sum_{m=1}^{M} [Y^{\mathcal{P}^{(i)}}]_{k,m}, \quad k = 1, \dots, K, \quad i = 1, \dots, I. \tag{14} $$

It is worth noting that in terms of the hardware structure, the sub-samples used to generate additional 
samples must be chosen from different BMIs as well as different integration subperiods. This is equivalent 
to collecting additional samples by correlating the signal with additional sampling waveforms which are 
not present among the actual BMI sampling waveforms. Each of these additional sampling waveforms 
comprises the non-overlapping subperiods of M different original waveforms. 
Now the question is how many permuted matrices, which satisfy the above summarized conditions,
can be generated based on $Y$. Consider the following $K \times M$ matrix

$$ Z = (\underbrace{z, z, \dots, z}_{M \text{ times}}) \tag{15} $$

where $z$ is the vector of indexes. Applying the column permutation set $\mathcal{P}^{(i)}$ to the columns of $Z$, we
obtain a permuted matrix $Z^{\mathcal{P}^{(i)}} = (z^{\pi_1^{(i)}}, \dots, z^{\pi_j^{(i)}}, \dots, z^{\pi_M^{(i)}})$. Then the set of all permuted versions of
$Z$ can be denoted as $\mathcal{S}_Z = \{Z^{\mathcal{P}^{(1)}}, \dots, Z^{\mathcal{P}^{(I)}}\}$. With these notations, the following theorem is in order.

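Theorem 1. The size of $\mathcal{S}_Z$, i.e., the number $I$ of permutation sets $\mathcal{P}^{(i)}$, $i = 1, \dots, I$ which satisfy the
conditions

$$ [Z^{\mathcal{P}^{(i)}}]_{k,j} \ne [Z^{\mathcal{P}^{(i)}}]_{k,r}, \quad \forall Z^{\mathcal{P}^{(i)}} \in \mathcal{S}_Z, \ j \ne r, \ k \in \{1, \dots, K\}, \ j, r \in \{1, \dots, M\} \tag{16} $$

$$ \exists! \, j \ \text{or} \ \nexists \, j \ \text{such that} \ [Z^{\mathcal{P}^{(i)}}]_{k,j} = [Z^{\mathcal{P}^{(l)}}]_{h,j}, \quad \forall Z^{\mathcal{P}^{(i)}}, Z^{\mathcal{P}^{(l)}} \in \mathcal{S}_Z, \ Z^{\mathcal{P}^{(i)}} \ne Z^{\mathcal{P}^{(l)}}, \ \forall j \in \{1, \dots, M\}, \ \forall k, h \in \{1, \dots, K\} \tag{17} $$

is at most $K - 1$. Here $[Z^{\mathcal{P}^{(i)}}]_{k,j}$ stands for the $(k,j)$-th element of the permuted matrix $Z^{\mathcal{P}^{(i)}}$.

Remark 1. Using the property that $z_{\pi(k)} = \pi(k)$ for the vector of indexes $z$, the conditions (16) and
(17) can also be expressed in terms of permutations as

$$ \pi_j^{(i)}(k) \ne \pi_r^{(i)}(k), \quad \forall i \in \{1, \dots, I\}, \ j \ne r, \ k \in \{1, \dots, K\}, \ j, r \in \{1, \dots, M\} \tag{18} $$

$$ \exists! \, j \ \text{or} \ \nexists \, j \ \text{such that} \ \pi_j^{(i)}(k) = \pi_j^{(l)}(h), \quad \forall i, l \in \{1, \dots, I\}, \ i \ne l, \ \forall j \in \{1, \dots, M\}, \ \forall k, h \in \{1, \dots, K\}. \tag{19} $$
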

Proof: See Appendix A. ■ 
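Example 1: Let the specific choice of index permutations be $\pi_s(k) = ((s + k - 2) \bmod K) + 1$, $s, k = 1, \dots, K$
with $\pi_1$ being the identity permutation and 'mod' standing for the modulo operation. For this
specific choice, $\pi_j^{(i)} = \pi_{[i(j-1) \bmod K]+1}$, $i = 1, \dots, K - 1$, $j = 1, \dots, M$. Consider the following matrix
notation for the set $\mathcal{P}$ where the elements along the $i$-th row are the permutations $\mathcal{P}^{(i)}$, $i = 1, \dots, I$

$$ \mathcal{P} = \begin{pmatrix} \mathcal{P}^{(1)} \\ \mathcal{P}^{(2)} \\ \mathcal{P}^{(3)} \\ \vdots \\ \mathcal{P}^{(K-2)} \\ \mathcal{P}^{(K-1)} \end{pmatrix} = \begin{pmatrix} \pi_1^{(1)} & \pi_2^{(1)} & \cdots & \pi_M^{(1)} \\ \pi_1^{(2)} & \pi_2^{(2)} & \cdots & \pi_M^{(2)} \\ \pi_1^{(3)} & \pi_2^{(3)} & \cdots & \pi_M^{(3)} \\ \vdots & \vdots & & \vdots \\ \pi_1^{(K-2)} & \pi_2^{(K-2)} & \cdots & \pi_M^{(K-2)} \\ \pi_1^{(K-1)} & \pi_2^{(K-1)} & \cdots & \pi_M^{(K-1)} \end{pmatrix} = \begin{pmatrix} \pi_1 & \pi_2 & \cdots & \pi_M \\ \pi_1 & \pi_3 & \cdots & \pi_{[2(M-1) \bmod K]+1} \\ \pi_1 & \pi_4 & \cdots & \pi_{[3(M-1) \bmod K]+1} \\ \vdots & \vdots & & \vdots \\ \pi_1 & \pi_{K-1} & \cdots & \pi_{[(K-2)(M-1) \bmod K]+1} \\ \pi_1 & \pi_K & \cdots & \pi_{[(K-1)(M-1) \bmod K]+1} \end{pmatrix}. \tag{20} $$
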

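Note that not all permutations $\mathcal{P}^{(i)}$, $i = 1, \dots, I$ used in (20) may be permissible. In fact, the set of
permutations $\mathcal{P}^{(i)}$ with $K/\gcd(i, K) < M$ has at least one repeated permutation that contradicts the
condition (18). Here $\gcd(\cdot, \cdot)$ stands for the greatest common divisor of two numbers. For example, for
$K = 8$ and $M = 4$, $K/\gcd(4, K) = 2 < M$ and $\mathcal{P}^{(4)}$ is impermissible. Therefore, instead of $K - 1 = 7$,
only the following 6 sets of permutations are allowed

$$ \mathcal{P} = \begin{pmatrix} \mathcal{P}^{(1)} \\ \mathcal{P}^{(2)} \\ \mathcal{P}^{(3)} \\ \mathcal{P}^{(4)} \\ \mathcal{P}^{(5)} \\ \mathcal{P}^{(6)} \end{pmatrix} = \begin{pmatrix} \pi_1^{(1)} & \pi_2^{(1)} & \pi_3^{(1)} & \pi_4^{(1)} \\ \pi_1^{(2)} & \pi_2^{(2)} & \pi_3^{(2)} & \pi_4^{(2)} \\ \pi_1^{(3)} & \pi_2^{(3)} & \pi_3^{(3)} & \pi_4^{(3)} \\ \pi_1^{(4)} & \pi_2^{(4)} & \pi_3^{(4)} & \pi_4^{(4)} \\ \pi_1^{(5)} & \pi_2^{(5)} & \pi_3^{(5)} & \pi_4^{(5)} \\ \pi_1^{(6)} & \pi_2^{(6)} & \pi_3^{(6)} & \pi_4^{(6)} \end{pmatrix} = \begin{pmatrix} \pi_1 & \pi_2 & \pi_3 & \pi_4 \\ \pi_1 & \pi_3 & \pi_5 & \pi_7 \\ \pi_1 & \pi_4 & \pi_7 & \pi_2 \\ \pi_1 & \pi_6 & \pi_3 & \pi_8 \\ \pi_1 & \pi_7 & \pi_5 & \pi_3 \\ \pi_1 & \pi_8 & \pi_7 & \pi_6 \end{pmatrix}. \tag{21} $$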
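
The count of permissible sets in this example can be reproduced with a short sketch. It assumes the cyclic-shift permutations of Example 1, so that each set is fully described by the subscripts $s_j = [i(j-1) \bmod K] + 1$, and it checks only the "no repeated permutation" requirement, i.e., condition (18). The function names are hypothetical and chosen for this illustration.

```python
from math import gcd


def shift_indices(i, K, M):
    """Subscripts s_j of the permutations pi_{s_j} in the i-th set: s_j = [i(j-1) mod K] + 1."""
    return [(i * (j - 1)) % K + 1 for j in range(1, M + 1)]


def has_repeated_permutation(i, K, M):
    """True if two columns of the i-th set use the same permutation, which violates (18)."""
    s = shift_indices(i, K, M)
    return len(set(s)) < len(s)


def permissible_sets(K, M):
    """Indices i in 1..K-1 whose permutation set has no repeated permutation."""
    return [i for i in range(1, K) if not has_repeated_permutation(i, K, M)]


K, M = 8, 4
allowed = permissible_sets(K, M)
for i in allowed:
    print(i, shift_indices(i, K, M))
```

For K = 8 and M = 4 the only rejected index is i = 4, and the subscripts printed for the remaining six indices are the rows of the right-most matrix in (21). The direct check agrees with the criterion $K/\gcd(i, K) < M$ stated in the text.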



Theorem 1 shows how many different permuted versions of the original sub-sample matrix $Y$ can
be obtained such that the correlation between the original and additional samples would be minimal.
Indeed, since the set of sub-samples that are used to build additional samples is chosen in a way that
additional samples have at most one sub-sample in common with the previous samples, i.e., conditions
(18) and (19) are satisfied, the set of permutations (20) is a valid candidate. The $i$-th element of $\mathcal{P}$, i.e.,
the element $\mathcal{P}^{(i)} = (\pi_1^{(i)}, \dots, \pi_M^{(i)})$, is the set of permutations applied to $Y$ to obtain $Y^{\mathcal{P}^{(i)}}$. Adding up
the entries along the rows of $Y^{\mathcal{P}^{(i)}}$, a set of additional samples can be obtained.
Example 2: Let the number of new samples $K_a$ be at most $K$. This means that all permutations are
given by only $\mathcal{P}^{(1)}$ in (20). In this special case, the sub-sample selection method can be summarized as
follows. For constructing the $(K+1)$-st sample, $M$ sub-samples on the main diagonal of $Y$ are summed
up together. Then the $M$ sub-samples on the second diagonal are used to construct the $(K+2)$-nd
sample, and so on up to the $K_a$-th additional sample. Mathematically, the so constructed additional samples can be
expressed in terms of the elements of $Y$ as

$$ y_{K+k} = \sum_{m=1}^{M} y_{l,m}, \quad k = 1, \dots, K_a \tag{22} $$

where $l = [(k + m - 2) \bmod K] + 1$ and $K_a \le K$. Fig. 2 shows schematically how the sub-samples are
selected in this example.




Fig. 2. Sub-sample selection principle for building additional samples in Example 2. 
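
The diagonal selection rule of Example 2 can be traced on a toy sub-sample matrix. This is a minimal sketch assuming K = 4 BMIs and M = 3 subperiods. The entries are set to 10k + m purely so that the row and column of every selected sub-sample can be read off; they are not measured data, and the function names are hypothetical.

```python
def original_samples(Y):
    """Row sums of Y: one sample per BMI over the full integration period, as in (13)."""
    return [sum(row) for row in Y]


def additional_samples(Y, Ka):
    """Sample K+k sums y_{l,m} over m with l = [(k + m - 2) mod K] + 1, as in (22)."""
    K, M = len(Y), len(Y[0])
    if Ka > K:
        raise ValueError("this selection rule covers Ka <= K only")
    extra = []
    for k in range(1, Ka + 1):
        total = 0
        for m in range(1, M + 1):
            l = (k + m - 2) % K + 1      # 1-based row (BMI) index of the selected sub-sample
            total += Y[l - 1][m - 1]
        extra.append(total)
    return extra


# Toy K x M matrix: entry 10*k + m sits in row k, column m (1-based).
Y = [[10 * k + m for m in range(1, 4)] for k in range(1, 5)]
print(original_samples(Y))
print(additional_samples(Y, Ka=4))
```

The first additional sample is 11 + 22 + 33, the main diagonal. The third is 31 + 42 + 13, which shows the wrap-around to the first row once the diagonal runs past the last BMI.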

Our segmented sampling process can be equivalently expressed in terms of the measurement matrix. Let
$\Phi$ be the original $K \times N$ measurement matrix. Let the $k$-th row of the matrix $\Phi$ be $\phi_k = (\phi_{k,1}, \dots, \phi_{k,M})$
where $\phi_{k,j}$, $j = 1, \dots, M$ are some vectors. Let for simplicity, the length of $\phi_{k,j}$ be $N/M$ and $N/M$
be an integer number. The set of permutations applied to $Y$ in order to obtain $Y^{\mathcal{P}^{(i)}}$ is $\mathcal{P}^{(i)}$. Then
the operation $\Phi^{\mathcal{P}^{(i)}}$ can be expressed as follows. The first $N/M$ columns of $\Phi$, which are the vectors
$\phi_{k,1}$, $k \in \{1, \dots, K\}$, are permuted with $\pi_1^{(i)}$. The second $N/M$ columns of $\Phi$ are permuted with $\pi_2^{(i)}$ and
so on until the last $N/M$ columns of $\Phi$ which are permuted with $\pi_M^{(i)}$. Then the extended measurement
matrix which combines all possible permutations $i = 1, \dots, I$ can be expressed as

$$ \Phi_e = \left( \Phi^T, (\Phi^{\mathcal{P}^{(1)}})^T, \dots, (\Phi^{\mathcal{P}^{(I)}})^T \right)^T \tag{23} $$

where $K_e = K + K_a = K + KI$.






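Example 3: Continuing with the setup used in Example 2, let $K_a \le K$. Then the extended measurement
matrix is

$$ \Phi_e = \begin{pmatrix} \Phi \\ \Phi_1 \end{pmatrix} = \begin{pmatrix} \phi_{1,1} & \phi_{1,2} & \cdots & \phi_{1,M} \\ \vdots & \vdots & & \vdots \\ \phi_{K,1} & \phi_{K,2} & \cdots & \phi_{K,M} \\ \phi_{1,1} & \phi_{2,2} & \cdots & \phi_{M,M} \\ \vdots & \vdots & & \vdots \\ \phi_{\pi_1(K_a),1} & \phi_{\pi_2(K_a),2} & \cdots & \phi_{\pi_M(K_a),M} \end{pmatrix} \tag{24} $$

where $\Phi_1$ contains only $K_a$ rows of $\Phi^{\mathcal{P}^{(1)}}$ and $\Phi_1 = \Phi^{\mathcal{P}^{(1)}}$ if $K_a = K$.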
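
The equivalence between reusing sub-samples and extending the measurement matrix can be checked numerically. The sketch below is a discrete-time illustration under stated assumptions: the waveform entries are ±1 without the $1/\sqrt{N}$ scaling and the signal is integer-valued, so that both routes can be compared exactly. The sizes K = 4, M = 3, N = 12, $K_a$ = 3 and all function names are hypothetical.

```python
import random


def sub_sample_matrix(Phi, f, M):
    """K x M matrix of segment-wise inner products (a discrete analogue of the sub-samples)."""
    seg = len(f) // M
    return [[sum(row[n] * f[n] for n in range(j * seg, (j + 1) * seg)) for j in range(M)]
            for row in Phi]


def samples_from_sub_samples(Y, Ka):
    """Original samples (row sums) followed by Ka additional samples (wrapped diagonals)."""
    K, M = len(Y), len(Y[0])
    original = [sum(row) for row in Y]
    extra = [sum(Y[(k + m - 2) % K][m - 1] for m in range(1, M + 1))
             for k in range(1, Ka + 1)]
    return original + extra


def extended_matrix(Phi, M, Ka):
    """Stack Phi with Ka rows whose m-th segment is taken from row [(k + m - 2) mod K] + 1."""
    K, seg = len(Phi), len(Phi[0]) // M
    rows = [list(row) for row in Phi]
    for k in range(1, Ka + 1):
        new_row = []
        for m in range(1, M + 1):
            source = (k + m - 2) % K      # 0-based index of the BMI supplying segment m
            new_row.extend(Phi[source][(m - 1) * seg: m * seg])
        rows.append(new_row)
    return rows


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


rng = random.Random(1)
K, M, N, Ka = 4, 3, 12, 3
Phi = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(K)]
f = [rng.randint(-5, 5) for _ in range(N)]

via_matrix = matvec(extended_matrix(Phi, M, Ka), f)
via_sub_samples = samples_from_sub_samples(sub_sample_matrix(Phi, f, M), Ka)
print(via_matrix == via_sub_samples)
```

Both routes give the same $K + K_a$ numbers because each additional row of the extended matrix is a concatenation of non-overlapping segments of $M$ different original rows.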



B. Implementation Issues and Discussion 

Due to the special structure of the extended measurement matrix $\Phi_e$, the sampling hardware needs only
$K$ parallel BMIs for collecting $KI$ additional samples. These BMIs are essentially the same as those in Fig. 1. The
only difference is that the integration period T is divided into M equal subperiods. After every subperiod, 
each integrator's output is sampled and the integrator is reset. In addition, a multiplexer which selects 
the sub-samples for constructing additional samples is needed. Note that partial sums can be kept for 
constructing the samples (original and additional), that is, the results of the integration are updated and 
accumulated for each sample iteratively after each subperiod. In this way, there is no need of designing 
the circuitry to memorize the matrix of sub-samples Y , but only the partial sums for each sample are 
memorized at any current subperiod. 
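
The partial-sum bookkeeping described above can be sketched as follows. The sketch assumes the diagonal selection rule of Example 2 ($K_a \le K$) and simulates the integrator read-outs with a generator. The toy matrix Y exists only to feed that generator, whereas the accumulation routine itself keeps just $K + K_a$ running sums. All names are hypothetical.

```python
def integrator_outputs(Y):
    """Yield, subperiod by subperiod, the K sub-samples read from the integrators."""
    M = len(Y[0])
    for m in range(M):
        yield [row[m] for row in Y]


def stream_samples(columns, K, Ka):
    """Update K + Ka partial sums after every subperiod without storing the sub-sample matrix."""
    acc = [0] * (K + Ka)
    for m, column in enumerate(columns, start=1):
        for k in range(1, K + 1):
            acc[k - 1] += column[k - 1]          # original sample k always reads BMI k
        for k in range(1, Ka + 1):
            l = (k + m - 2) % K + 1              # multiplexer: BMI l feeds additional sample k
            acc[K + k - 1] += column[l - 1]
    return acc


# Toy sub-samples 10*k + m (row k, column m), used only to simulate the hardware read-out.
Y = [[10 * k + m for m in range(1, 4)] for k in range(1, 5)]
samples = stream_samples(integrator_outputs(Y), K=4, Ka=4)
print(samples)
```

After the last subperiod the accumulators hold the four original samples followed by the four additional ones, which matches what summing rows and wrapped diagonals of the stored matrix would give.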

Since the proposed segmented AIC scheme collects the sub-samples at a rate $M$ times higher than
the AIC in Fig. 1, an improved signal recovery performance is expected. It agrees with the convention that
the recovery performance cannot be improved only due to the post processing. Moreover, note that since
the original random sampling waveforms are linearly independent with high probability, the additional 
sampling waveforms of our segmented compressed sampling method are also linearly independent with 
overwhelming probability. However, a sufficient condition that guarantees that the extended measurement 
matrix of the proposed segmented AIC scheme is an eligible choice is the RIP. Therefore, the RIP for 
the proposed segmented compressed sampling scheme is analyzed in the next section. 



IV. RIP for the Segmented Compressed Sampling Method

The purpose of this section is to show that the extended measurement matrix $\Phi_e$ in (23) satisfies the
RIP if the original measurement matrix $\Phi$ satisfies it. The latter will also imply that $\Phi_e$ can be used
as a valid CS measurement matrix. In our setup it is only assumed that the elements of the original
measurement matrix are i.i.d. zero mean Gaussian variables and the measurement matrix is extended by
adding its permuted versions as described in the previous section.

Let us first consider the special case of Example 3. In this case, $\Phi$, $\Phi_1$, and $\Phi_e$ are the original
measurement matrix, the matrix of additional sampling waveforms, and the extended measurement matrix
given by (24), respectively. Let the matrix $\Phi$ satisfy the RIP with sufficiently high probability. For
example, let the elements of $\Phi$ be i.i.d. zero mean Gaussian random variables with variance $1/N$. Let
$T$ be any subset of size $S$ of the set $\{1, \dots, N\}$. Then for any $0 < \delta_S < 1$, the matrix $\Phi_T$, which is the
sub-matrix of $\Phi$ consisting of only the columns with their indexes in the set $T$, satisfies (2) with the
following probability [21]

$$ \Pr\{\Phi_T \text{ satisfies (2)}\} \ge 1 - 2 (12/\delta_S)^S e^{-C_0(\delta_S/2) K} \tag{25} $$

where $C_0(\delta_S/2) = \delta_S^2/16 - \delta_S^3/48$. Hereafter, the notation $C_0$ is used instead of $C_0(\delta_S/2)$ for brevity.
First, the following auxiliary result on the extended measurement matrix $\Phi_e$ is of interest.

Lemma 1. Let the elements of the measurement matrix $\Phi$ be i.i.d. zero mean Gaussian variables with
variance $1/N$, $\Phi_e$ be formed as shown in (24), and $T \subset \{1, \dots, N\}$ be of size $S$. If $K_a$ is chosen such
that $\min\{K, K_a + M - 1\} \le \lceil (K + K_a)/2 \rceil$, then for any $0 < \delta_S < 1$, the following inequality holds

$$ \Pr\{(\Phi_e)_T \text{ satisfies (2)}\} \ge 1 - 4 (12/\delta_S)^S e^{-C_0 \lfloor (K + K_a)/2 \rfloor} \tag{26} $$

where $\lceil x \rceil$ and $\lfloor x \rfloor$ are the smallest integer larger than or equal to $x$ and the largest integer smaller
than or equal to $x$, respectively, and $C_0$ is a constant given after (25).

Proof: See Appendix B. ■
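
To get a feel for the magnitudes in (25) and (26), the right-hand sides can be evaluated numerically. The sketch below assumes the exponent in (26) is $C_0 \lfloor (K + K_a)/2 \rfloor$, the same quantity that appears in Theorem 2. The function names and the parameter values in the last two lines are hypothetical, chosen only to show that such bounds are informative for large $K$ and vacuous for small $K$.

```python
import math


def c0(delta):
    """C_0(delta/2) = delta^2/16 - delta^3/48, the constant stated after (25)."""
    return delta ** 2 / 16 - delta ** 3 / 48


def bound_original(K, S, delta):
    """Right-hand side of (25) for a fixed column subset T of size S."""
    return 1 - 2 * (12 / delta) ** S * math.exp(-c0(delta) * K)


def bound_extended(K, Ka, S, delta):
    """Right-hand side of (26); meaningful only when the condition of Lemma 1 on Ka holds."""
    return 1 - 4 * (12 / delta) ** S * math.exp(-c0(delta) * ((K + Ka) // 2))


print(c0(0.5))
print(bound_original(20000, 5, 0.5))       # large K: the bound is close to 1
print(bound_extended(100, 100, 5, 0.5))    # small K: the bound is negative, hence vacuous
```

A negative value simply means the bound says nothing for those parameters; it does not indicate that the matrix fails the RIP.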
Using the above lemma, the following main result, which states that the extended measurement matrix
$\Phi_e$ in (24) satisfies the RIP, can also be proved.

Theorem 2. Let $\Phi_e$ be formed as in (24) and let the elements of $\Phi$ be i.i.d. zero mean Gaussian variables
with variance $1/N$. If $\min\{K, K_a + M - 1\} \le \lceil (K + K_a)/2 \rceil$, then for any $0 < \delta_S < 1$, there exist
constants $C_3$ and $C_4$, which depend only on $\delta_S$, such that for $S \le C_3 \lfloor (K + K_a)/2 \rfloor / \log(N/S)$ the
inequality (2) holds for all $S$-sparse vectors with probability that satisfies the following inequality

$$ \Pr\{\Phi_e \text{ satisfies RIP}\} \ge 1 - 4 e^{-C_4 \lfloor (K + K_a)/2 \rfloor} \tag{27} $$

where $C_4 = C_0 - C_3 [1 + (1 + \log(12/\delta_S)) / \log(N/S)]$ and $C_3$ is small enough to guarantee that
$C_4$ is positive.
Proof: See Appendix C. ■
Let us consider now the general case when the number of additional samples $K_a$ is larger than the
number of BMIs $K$, i.e., $K_a > K$, $K_e > 2K$, and the extended measurement matrix is given by (23).
Note that while proving Lemma 1 for the special case of Example 3, we were able to split the rows
of $\Phi_e$ into two sets each consisting of independent entries. In the general case, some of the entries of
the original measurement matrix $\Phi$ appear more than twice in the extended measurement matrix $\Phi_e$ and
it is no longer possible to split the rows of $\Phi_e$ into only two sets with independent entries. Due to the
way that the additional samples are built, the samples $y_{iK+1}, y_{iK+2}, \dots, y_{(i+1)K}$ obtained based on the
permuted matrix $Y^{\mathcal{P}^{(i)}}$, i.e., the $i$-th set of additional samples, are uncorrelated with each other, but they
are correlated with every other set of samples based on the original matrix $Y$ and the permuted matrices
$Y^{\mathcal{P}^{(l)}}$, $l \ne i$. Thus, the following principle can be used while partitioning the rows of $\Phi_e$ into the
sets with independent entries. First, the rows corresponding to the original samples form a single set
with independent entries, then the rows corresponding to the first set of additional samples based on the
matrix $Y^{\mathcal{P}^{(1)}}$ form another set and so on. Then the number of such sets is $n_p = \lceil K_e/K \rceil$, while the size
of each set is

$$ K_i = \begin{cases} K, & 1 \le i \le n_p - 1 \\ K_e - (n_p - 1) K, & i = n_p. \end{cases} \tag{28} $$

The extended measurement matrix (23) can be rewritten as

$$ \Phi_e = \left( (\Phi_e)_1^T, (\Phi_e)_2^T, \dots, (\Phi_e)_{n_p}^T \right)^T \tag{29} $$

where $(\Phi_e)_i$ is the $i$-th partition of $\Phi_e$ of size given by (28). Then the general form of Lemma 1 is as
follows.

Lemma 2. Let the elements of the measurement matrix $\Phi$ be i.i.d. zero mean Gaussian variables with
variance $1/N$, $\Phi_e$ be the extended measurement matrix (23), and $T \subset \{1, \dots, N\}$ be of size $S$. Let also
$K_a > K$ and $n_p = \lceil K_e/K \rceil$. Then, for any $0 < \delta_S < 1$, the following inequality holds

$$ \Pr\{(\Phi_e)_T \text{ satisfies (2)}\} \ge 1 - 2 (n_p - 1) (12/\delta_S)^S e^{-C_0 K} - 2 (12/\delta_S)^S e^{-C_0 K_{n_p}} \tag{30} $$

where $K_{n_p} = K_e - (\lceil K_e/K \rceil - 1) K$ and $C_0$ is a constant given after (25).

Proof: See Appendix D. ■
Lemma 2 is needed to prove that the extended measurement matrix (29) satisfies the RIP. Therefore,
the general version of Theorem 2 is as follows.

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Theorem 3. Let the elements of ^ be i.i.d. zero mean Gaussian variables with variance 1/N and $e 
be formed as in (I23I ). If Ka > K, then for any < (5s < 1, there exist constants C3, C4 and C4, such 
that for S < C-^Kn^ / \og{N / S) the inequality © holds for all S-sparse vectors with probability that 
satisfies the following inequality 

Pr{*e satisfies RIP] > 1 - 2{np - l)e"^^^ - le'^^^"" (31) 

where C'^ = Cq - (CzKnjK) x [1 + (1 + log (12/55)) / log I S)\ d is given after and C3 is 
small enough to guarantee that C4 and C4 are both positive. 

Proof: See Appendix E. ■ 
When splitting the rows of $\Phi_e$ into a number of sets as described before Lemma 2, it may happen that the last subset $(\Phi_e)_{n_p}$ has the smallest size $K_{n_p}$. As a result, the dominant term in (31) will likely be the term $2e^{-C_4 K_{n_p}}$. Moreover, it may lead to a more stringent sparsity condition, that is, $S \le C_3 K_{n_p}/\log(N/S)$. To improve the lower bound in (31), we can move some of the rows from $(\Phi_e)_{n_p-1}$ to $(\Phi_e)_{n_p}$ in order to make the last two partitions of almost the same size. Then the requirement on the sparsity level becomes $S \le C_3 K'/\log(N/S)$ where $K' = \lfloor (K + K_{n_p})/2 \rfloor$. Therefore, the lower bound on the probability calculated in (31) improves.
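
The partition sizes in (28) and the balancing of the last two partitions can be illustrated with a short sketch. The function names below are hypothetical and only the sizes are computed; which rows may actually be moved without breaking independence is not modeled here.

```python
def partition_sizes(K, K_a):
    """Sizes K_1, ..., K_{n_p} of the row partitions of the extended matrix, as in (28).

    K   : number of rows of the original measurement matrix (number of BMIs)
    K_a : number of additional rows, so that K_e = K + K_a
    """
    K_e = K + K_a
    n_p = -(-K_e // K)  # integer ceiling of K_e / K
    return [K] * (n_p - 1) + [K_e - (n_p - 1) * K]


def balance_last_two(sizes):
    """Even out the sizes of the last two partitions (sizes only, illustrative)."""
    if len(sizes) < 2:
        return list(sizes)
    total = sizes[-2] + sizes[-1]
    return list(sizes[:-2]) + [total - total // 2, total // 2]


if __name__ == "__main__":
    sizes = partition_sizes(16, 40)
    print(sizes, balance_last_two(sizes))
```

With K = 16 and K_a = 40 the last partition has only 8 rows; after balancing, the smallest partition has $K' = \lfloor (16 + 8)/2 \rfloor = 12$ rows, which is the quantity that enters the relaxed sparsity condition.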

V. Performance Analysis of the Recovery

In this section, we aim at answering the question whether signal recovery also improves if the proposed segmented AIC method, i.e., the extended measurement matrix $\Phi_e$ (23), is used instead of the original matrix $\Phi$. The study is performed based on the empirical risk minimization method for signal recovery from noisy random projections [4]. As mentioned in Section II, the LASSO method can be viewed as one of the possible implementations of the empirical risk minimization method.

We first consider the special case of Example 3 when the extended measurement matrix is given by (24). Let the entries of the measurement matrix $\Phi$ be selected with equal probability as $\pm 1/\sqrt{N}$, i.e., be i.i.d. Bernoulli distributed with variance $1/N$. This assumption is the same as in [4] and it is used here in order to shorten our derivations by only emphasizing the differences caused by our construction of the matrix $\Phi_e$, where some rows are correlated with each other, as compared to the case analyzed in [4], where the measurement matrix consists of all i.i.d. entries. Note that our results can be easily applied to the case of Gaussian distributed entries of $\Phi$ by only changing the moments of the Bernoulli distribution to the moments of the Gaussian distribution.



April 27, 2010 



DRAFT 



16 



Let $r(\hat{f}, f) = r(\hat{f}) - r(f)$ be the "excess risk" between the candidate reconstruction $\hat{f}$ of the signal sampled using the extended measurement matrix $\Phi_e$ and the actual signal $f$, and $\hat{r}(\hat{f}, f) = \hat{r}(\hat{f}) - \hat{r}(f)$ be the "empirical excess risk" between the candidate signal reconstruction and the actual signal, where $r(\cdot)$ and $\hat{r}(\cdot)$ denote the risk and the empirical risk defined earlier. Then the difference between the "excess risk" and the "empirical excess risk" can be found as

$$r(\hat{f}, f) - \hat{r}(\hat{f}, f) = \frac{1}{K_e} \sum_{j=1}^{K_e} \left( U_j - E\{U_j\} \right) \tag{32}$$

where $U_j \triangleq (y_j - \phi_j f)^2 - (y_j - \phi_j \hat{f})^2$.

The mean-square error (MSE) between the candidate reconstruction and the actual signal can be expressed as [24]

$$\mathrm{MSE} \triangleq E\{\|g\|^2\} = N r(\hat{f}, f) \tag{33}$$

where $g = \hat{f} - f$. Therefore, if we know an upper bound on the right-hand side of (32), denoted hereafter as $U$, we can immediately find an upper bound on the MSE in the form $\mathrm{MSE} \le N \hat{r}(\hat{f}, f) + NU$. In other words, to find the candidate reconstruction $\hat{f}$ one can minimize $\hat{r}(\hat{f}, f) + U$, which will also result in a bound on the MSE as in (8).

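The Craig-Bernstein inequality [4], [26] can be used in order to find an upper bound $U$ on the right-hand side of (32). In the notation of our paper, this inequality states that the probability of the following event

$$\frac{1}{K_e} \sum_{j=1}^{K_e} \left( U_j - E\{U_j\} \right) \le \frac{\log(1/\delta)}{K_e \epsilon} + \frac{\epsilon \, \mathrm{var}\left\{ \sum_{j=1}^{K_e} U_j \right\}}{2 K_e (1 - \zeta)} \tag{34}$$

is greater than or equal to $1 - \delta$ for $0 < \epsilon h \le \zeta < 1$, if the random variables $U_j$ satisfy the following moment condition for some $h > 0$ and all $k \ge 2$

$$E\left\{ |U_j - E\{U_j\}|^k \right\} \le \frac{k! \, \mathrm{var}\{U_j\} \, h^{k-2}}{2}. \tag{35}$$

The second term on the right-hand side of (34) contains the variance $\mathrm{var}\{\sum_{j=1}^{K_e} U_j\}$, which we need to calculate or at least bound from above.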

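In the case of the extended measurement matrix, the random variables $U_j$, $j = 1, \ldots, K_e$ all satisfy the moment condition for the Craig-Bernstein inequality [26] with the same coefficient $h = 16B^2e + 8\sqrt{2}B\sigma$, where $\sigma^2$ is the variance of the Gaussian noise.¹ Moreover, it is easy to show that the following bound on the variance of $U_j$ is valid for the extended measurement matrix²

$$\mathrm{var}\{U_j\} \le \left( \frac{2\|g\|^2}{N} + 4\sigma^2 \right) \frac{\|g\|^2}{N} \le \left( 8B^2 + 4\sigma^2 \right) r(\hat{f}, f). \tag{36}$$

¹The derivation of the coefficient $h$ coincides with a similar derivation in [4], and therefore, is omitted.

However, unlike [4], in the case of the extended measurement matrix, the variables $U_j$ are not independent from each other. Thus, we cannot simply replace the term $\mathrm{var}\{\sum_{j=1}^{K_e} U_j\}$ with the sum of the variances of $U_j$, $j = 1, \ldots, K_e$. Using the definition of the variance and the fact that $E\{U_j\} = -\|g\|^2/N$, we can write that

$$\begin{aligned}
\mathrm{var}\left\{ \sum_{j=1}^{K_e} U_j \right\} &= E\left\{ \Big( \sum_{j=1}^{K_e} U_j \Big)^2 \right\} - \left( E\Big\{ \sum_{j=1}^{K_e} U_j \Big\} \right)^2 \\
&= \sum_{j=1}^{K_e} \left( E\{U_j^2\} - \Big( \frac{\|g\|^2}{N} \Big)^2 \right) + 2 \sum_{i=1}^{K_e - 1} \sum_{j=i+1}^{K_e} \left( E\{U_i U_j\} - \Big( \frac{\|g\|^2}{N} \Big)^2 \right) \\
&= \sum_{j=1}^{K_e} \mathrm{var}\{U_j\} + 2 \sum_{i=1}^{K_e - 1} \sum_{j=i+1}^{K_e} \left( E\{U_i U_j\} - \Big( \frac{\|g\|^2}{N} \Big)^2 \right)
\end{aligned} \tag{37}$$

where the upper bound on $\mathrm{var}\{U_j\}$ is given by (36). Using the fact that the random noise components $w_i$ and $w_j$ are independent of $\phi_i g$ and $\phi_j g$ (see the noisy sampling model), $E\{U_i U_j\}$ can be expressed as

$$\begin{aligned}
E\{U_i U_j\} &= E\left\{ [2 w_i \phi_i g - (\phi_i g)^2][2 w_j \phi_j g - (\phi_j g)^2] \right\} \\
&= 4E\{w_i w_j\}E\{(\phi_i g)(\phi_j g)\} - 2E\{w_i\}E\{(\phi_i g)(\phi_j g)^2\} \\
&\quad - 2E\{w_j\}E\{(\phi_j g)(\phi_i g)^2\} + E\{(\phi_i g)^2 (\phi_j g)^2\}.
\end{aligned} \tag{38}$$

The latter expression can be further simplified using the fact that $E\{w_i\} = E\{w_j\} = 0$. Thus, we obtain that

$$E\{U_i U_j\} = 4E\{w_i w_j\}E\{(\phi_i g)(\phi_j g)\} + E\{(\phi_i g)^2 (\phi_j g)^2\}. \tag{39}$$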

It is easy to verify that if $\phi_i$ and $\phi_j$ are independent, then $E\{U_i U_j\} = E\{(\phi_i g)^2\} E\{(\phi_j g)^2\} = (\|g\|^2/N)^2$, which indeed coincides with [4]. However, in our case, $\phi_i$ and $\phi_j$ may depend on each other. If they do, they have $L = N/M$ common entries, while the rest of the entries are independent. In addition, the additive noise terms $w_i$ and $w_j$ are no longer independent random variables either and, thus, $E\{w_i w_j\} = \sigma^2/M$. Without loss of generality, let the first $L$ entries of $\phi_i$ and $\phi_j$ be the same, that is,

$$\phi_i g = \underbrace{g_1 a_1 + \cdots + g_L a_L}_{A} + \underbrace{g_{L+1}\phi_{i,L+1} + \cdots + g_N \phi_{i,N}}_{P_i} \tag{40}$$

$$\phi_j g = \underbrace{g_1 a_1 + \cdots + g_L a_L}_{A} + \underbrace{g_{L+1}\phi_{j,L+1} + \cdots + g_N \phi_{j,N}}_{P_j} \tag{41}$$

with $a_1, \ldots, a_L$ being the common part between $\phi_i$ and $\phi_j$.

²This bound also coincides with a similar one in [4].

Let $g_A$ be a sub-vector of $g$ containing the $L$ elements of $g$ corresponding to the common part between $\phi_i$ and $\phi_j$, and $g_{A^c}$ be the sub-vector comprising the rest of the elements. Then using the fact that $A$, $P_i$, and $P_j$ are all zero mean independent random variables, we can express $E\{(\phi_i g)(\phi_j g)\}$ from the first term on the right-hand side of (39) as

$$E\{(\phi_i g)(\phi_j g)\} = E\{(A + P_i)(A + P_j)\} = E\{A^2\} + E\{A P_j\} + E\{A P_i\} + E\{P_i P_j\} = E\{A^2\} = \frac{\sum_{k=1}^{L} g_k^2}{N} = \frac{\|g_A\|^2}{N}. \tag{42}$$
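
The identity (42) can be checked by exact enumeration for a small example. The sketch below assumes Bernoulli entries $\pm 1/\sqrt{N}$ and two rows sharing their first $L$ entries; the function name and the test vector are illustrative only.

```python
from fractions import Fraction
from itertools import product


def shared_row_correlation(g, L):
    """Exact E{(phi_i g)(phi_j g)} for two Bernoulli (+-1/sqrt(N)) rows that
    share their first L entries and are independent elsewhere.

    The rows are enumerated with entries +-1; the factor 1/N coming from the
    two 1/sqrt(N) scalings is applied at the end.
    """
    N = len(g)
    total = Fraction(0)
    count = 0
    for common in product((-1, 1), repeat=L):
        A = sum(s * x for s, x in zip(common, g[:L]))
        for rest_i in product((-1, 1), repeat=N - L):
            P_i = sum(s * x for s, x in zip(rest_i, g[L:]))
            for rest_j in product((-1, 1), repeat=N - L):
                P_j = sum(s * x for s, x in zip(rest_j, g[L:]))
                total += Fraction((A + P_i) * (A + P_j))
                count += 1
    return total / (count * N)


if __name__ == "__main__":
    g = [1, 2, 3, 4]
    print(shared_row_correlation(g, 2))  # compare with ||g_A||^2 / N
```

For $g = (1, 2, 3, 4)$ and $L = 2$ the enumeration returns $5/4$, i.e., $\|g_A\|^2/N$ with $\|g_A\|^2 = 1 + 4$ and $N = 4$; the independent parts $P_i$ and $P_j$ contribute nothing to the cross-correlation.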



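Similarly, the second term on the right-hand side of (39) can be expressed as

$$E\{(\phi_i g)^2 (\phi_j g)^2\} = E\{(A^2 + P_i^2 + 2AP_i)(A^2 + P_j^2 + 2AP_j)\}. \tag{43}$$

Using the facts that $4E\{w_i w_j\} = 4\sigma^2/M$, $E\{A^2\} = \|g_A\|^2/N$, and $E\{P_i^2\} = E\{P_j^2\} = \|g_{A^c}\|^2/N$, the expression (43) can be further rewritten as

$$E\{(\phi_i g)^2 (\phi_j g)^2\} = E\{A^4 + A^2 P_j^2 + A^2 P_i^2 + P_i^2 P_j^2\} = E\{A^4\} + 2 \frac{\|g_A\|^2}{N} \frac{\|g_{A^c}\|^2}{N} + \left( \frac{\|g_{A^c}\|^2}{N} \right)^2. \tag{44}$$

Substituting (42) and (44) into (39), we obtain that

$$E\{U_i U_j\} = E\{A^4\} + 2 \frac{\|g_A\|^2}{N} \frac{\|g_{A^c}\|^2}{N} + \left( \frac{\|g_{A^c}\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_A\|^2}{N}. \tag{45}$$

Moreover, substituting (45) into (37), noting that $\|g\|^2 = \|g_A\|^2 + \|g_{A^c}\|^2$ and that the terms of the double sum in (37) vanish for independent pairs of rows, we find that

$$\mathrm{var}\left\{ \sum_{j=1}^{K_e} U_j \right\} = \sum_{j=1}^{K_e} \mathrm{var}\{U_j\} + 2 \sum_{\phi_i, \phi_j \text{ dependent}} \left( E\{A^4\} - \left( \frac{\|g_A\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_A\|^2}{N} \right). \tag{46}$$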

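Using the fact that the extended measurement matrix is constructed such that the waveforms $\phi_i$, $i = K + 1, \ldots, K_e$, are built upon $M$ rows of the original matrix, and also using the inequality³ $E\{A^4\} - (\|g_A\|^2/N)^2 \le 2(\|g_A\|^2/N)^2$ for all these $M$ rows, we obtain for every $\phi_i$, $i = K + 1, \ldots, K_e$ that

$$\sum_{k=1}^{M} \left( E\{A_k^4\} - \left( \frac{\|g_{A_k}\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_{A_k}\|^2}{N} \right) \le \sum_{k=1}^{M} \left( 2\left( \frac{\|g_{A_k}\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_{A_k}\|^2}{N} \right) \tag{47}$$

where $A_k$ corresponds to the first $L$ entries of $\phi_i$ for $k = 1$, to the entries from $L + 1$ to $2L$ for $k = 2$, and so on. Applying also the triangle inequality, we find that

$$\sum_{k=1}^{M} \left( 2\left( \frac{\|g_{A_k}\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_{A_k}\|^2}{N} \right) \le 2\left( \frac{\|g\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g\|^2}{N}. \tag{48}$$

³We skip the derivation of this inequality since it is relatively well known and can be found, for example, in [4, p. 4039].

Combining (47) and (48) and using the fact that there are $K_a$ additional rows in the extended measurement matrix, we obtain that

$$2 \sum_{\phi_i, \phi_j \text{ dependent}} \left( E\{A^4\} - \left( \frac{\|g_A\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_A\|^2}{N} \right) \le 4K_a \left( \frac{\|g\|^2}{N} \right)^2 + \frac{8\sigma^2 K_a}{M} \frac{\|g\|^2}{N}. \tag{49}$$

Noticing that $\|g\|^2/N = r(\hat{f}, f)$ and $\|g\|^2 \le 4NB^2$, the right-hand side of the inequality (49) can be further upper bounded as

$$4K_a \left( \frac{\|g\|^2}{N} \right)^2 + \frac{8\sigma^2 K_a}{M} \frac{\|g\|^2}{N} \le \left( 16 K_a B^2 + \frac{8\sigma^2 K_a}{M} \right) r(\hat{f}, f). \tag{50}$$

Using the upper bound (50) for the second term in (46) and the upper bound (36) for the first term in (46), we finally can upper bound $\mathrm{var}\{\sum_{j=1}^{K_e} U_j\}$ as

$$\mathrm{var}\left\{ \sum_{j=1}^{K_e} U_j \right\} \le K_e \left( 8B^2 \left( 1 + \frac{2K_a}{K_e} \right) + 4\sigma^2 \left( 1 + \frac{2K_a}{M K_e} \right) \right) r(\hat{f}, f). \tag{51}$$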



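Therefore, based on the Craig-Bernstein inequality, the probability that for a given candidate signal $\hat{f}$ the following inequality holds

$$r(\hat{f}, f) - \hat{r}(\hat{f}, f) \le \frac{\log(1/\delta)}{K_e \epsilon} + \frac{\left( 8B^2 \left( 1 + \frac{2K_a}{K_e} \right) + 4\sigma^2 \left( 1 + \frac{2K_a}{M K_e} \right) \right) \epsilon}{2(1 - \zeta)} \, r(\hat{f}, f) \tag{52}$$

is greater than or equal to $1 - \delta$.

Let $c(\hat{f})$ be chosen such that the Kraft inequality $\sum_{\hat{f} \in \mathcal{F}(B)} 2^{-c(\hat{f})} \le 1$ is satisfied (see also [4]), and let $\delta(\hat{f}) = 2^{-c(\hat{f})}\delta$. Applying the union bound to (52), it can be shown that for all $\hat{f} \in \mathcal{F}(B)$ and for all $\delta > 0$ the following inequality holds with probability of at least $1 - \delta$

$$r(\hat{f}, f) - \hat{r}(\hat{f}, f) \le \frac{c(\hat{f}) \log 2 + \log(1/\delta)}{K_e \epsilon} + \frac{\left( 8B^2 \left( 1 + \frac{2K_a}{K_e} \right) + 4\sigma^2 \left( 1 + \frac{2K_a}{M K_e} \right) \right) \epsilon}{2(1 - \zeta)} \, r(\hat{f}, f). \tag{53}$$

Finally, setting $\zeta = \epsilon h$ and

$$a = \frac{\left( 8B^2 \left( 1 + \frac{2K_a}{K_e} \right) + 4\sigma^2 \left( 1 + \frac{2K_a}{M K_e} \right) \right) \epsilon}{2(1 - \zeta)} \tag{54}$$

$$\epsilon < \frac{1}{\left( 4\left( 1 + \frac{2K_a}{K_e} \right) + 16e \right) B^2 + 8\sqrt{2} B\sigma + 2\sigma^2 \left( 1 + \frac{2K_a}{M K_e} \right)} \tag{55}$$

where $0 < \epsilon h \le \zeta < 1$ as required by the Craig-Bernstein inequality, the following inequality holds with probability of at least $1 - \delta$ for all $\hat{f} \in \mathcal{F}(B)$

$$(1 - a) \, r(\hat{f}, f) \le \hat{r}(\hat{f}, f) + \frac{c(\hat{f}) \log 2 + \log(1/\delta)}{K_e \epsilon}. \tag{56}$$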

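The following result on the recovery performance of the empirical risk minimization method is in order.

Theorem 4. Let $\epsilon$ be chosen as

$$\epsilon = \frac{1}{60(B + \sigma)^2} \tag{57}$$

which satisfies the inequality (55). Then the signal reconstruction $\hat{f}_{K_e}$ given by

$$\hat{f}_{K_e} = \arg \min_{\hat{f} \in \mathcal{F}(B)} \left\{ \hat{r}(\hat{f}) + \frac{c(\hat{f}) \log 2}{\epsilon K_e} \right\} \tag{58}$$

satisfies the following inequality

$$E\left\{ \frac{\|\hat{f}_{K_e} - f\|^2}{N} \right\} \le C_{1e} \min_{\hat{f} \in \mathcal{F}(B)} \left\{ \frac{\|\hat{f} - f\|^2}{N} + \frac{c(\hat{f}) \log 2 + 4}{\epsilon K_e} \right\} \tag{59}$$

where $C_{1e}$ is the constant given as

$$C_{1e} = \frac{1 + a}{1 - a}, \qquad a = \frac{2\left( 1 + \frac{2K_a}{K_e} \right) \left( \frac{B}{\sigma} \right)^2 + \left( 1 + \frac{2K_a}{M K_e} \right)}{(30 - 8e) \left( \frac{B}{\sigma} \right)^2 + (60 - 4\sqrt{2}) \left( \frac{B}{\sigma} \right) + 30} \tag{60}$$

with $a$ obtained from (54) for the specific choice of $\epsilon$ in (57).

Proof: The proof follows the exact steps of the proof of the related result for the uncorrelated case in [4, p. 4039-4040] with the exception of using, in our correlated case, the above calculated values for $\epsilon$ (57) and $a$ (54). ■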
Example 4: Let one set of samples be obtained based on the measurement matrix $\Phi_e$ with $K_a = K$, $K_e = 2K$, and $M = 8$, and let another set of samples be obtained using a $2K \times N$ measurement matrix with all i.i.d. (Bernoulli) elements. Let also $\epsilon$ be selected as in (57). Then the MSE error bounds for these two cases differ from each other only by a constant factor given for the former case by $C_{1e}$ in (60) and in the latter case by $C_1$ (see (8) and the row after). Considering the two limiting cases when $B/\sigma \to 0$ and $B/\sigma \to \infty$, the intervals of change for the corresponding coefficients can be obtained as $1.08 \le C_{1e} \le 2.88$ and $1.06 \le C_1 \le 1.63$, respectively.
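
The limiting values quoted in Example 4 can be reproduced numerically. The sketch below assumes that the constant has the form $(1 + a)/(1 - a)$ with $a$ evaluated at $\epsilon = 1/(60(B+\sigma)^2)$ and with the correlation corrections $2K_a/K_e$ and $2K_a/(MK_e)$ on the $B^2$ and $\sigma^2$ terms; this form is an assumption inferred from the surrounding derivation, and the function name is hypothetical.

```python
import math


def risk_constant(ratio, K, K_a, M):
    """Constant (1 + a) / (1 - a) as a function of ratio = B / sigma.

    K_a = 0 corresponds to a measurement matrix with all i.i.d. entries
    (no correlated rows), in which case the result does not depend on K or M.
    """
    K_e = K + K_a
    x = 2 * K_a / K_e          # assumed correction on the B^2 term
    y = 2 * K_a / (M * K_e)    # assumed correction on the sigma^2 term
    num = 2 * (1 + x) * ratio ** 2 + (1 + y)
    den = (30 - 8 * math.e) * ratio ** 2 + (60 - 4 * math.sqrt(2)) * ratio + 30
    a = num / den
    return (1 + a) / (1 - a)


if __name__ == "__main__":
    for ratio in (1e-9, 1.0, 10.0, 1e9):
        c1e = risk_constant(ratio, K=16, K_a=16, M=8)  # extended matrix
        c1 = risk_constant(ratio, K=32, K_a=0, M=8)    # i.i.d. matrix
        print(f"B/sigma={ratio:g}: C1e={c1e:.3f}, C1={c1:.3f}")
```

Under these assumptions the two limits of the extended-matrix constant come out close to 1.08 and 2.88, and those of the i.i.d. constant close to 1.07 and 1.64, in line with the intervals stated in the example up to rounding.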

The following result on the achievable recovery performance for a sparse or compressible signal sampled based on the extended measurement matrix $\Phi_e$ is also of great interest.

Theorem 5. For a sparse signal $f \in \mathcal{F}_s(B, S) = \{f : \|f\|^2 \le NB^2, \|f\|_{\ell_0} \le S\}$ and the corresponding reconstructed signal $\hat{f}_{K_e}$ obtained according to (58), there exists a constant $C_{2e} = C_{2e}(B, \sigma) > 0$, such that

$$\sup_{f \in \mathcal{F}_s(B, S)} E\left\{ \frac{\|\hat{f}_{K_e} - f\|^2}{N} \right\} \le C_{1e} C_{2e} \left( \frac{K_e}{S \log N} \right)^{-1}. \tag{61}$$

Similarly, for a compressible signal $f \in \mathcal{F}_c(B, \alpha, C_A) = \{f : \|f\|^2 \le NB^2, \|f^{(m)} - f\|^2 \le N C_A m^{-2\alpha}\}$ and the corresponding reconstructed signal $\hat{f}_{K_e}$ obtained according to (58), there exists a constant $C_{2e} = C_{2e}(B, \sigma, C_A) > 0$, such that

$$\sup_{f \in \mathcal{F}_c(B, \alpha, C_A)} E\left\{ \frac{\|\hat{f}_{K_e} - f\|^2}{N} \right\} \le C_{1e} C_{2e} \left( \frac{K_e}{\log N} \right)^{-2\alpha/(2\alpha + 1)}. \tag{62}$$

Proof: The proof follows the exact steps of the proofs of the related results for the uncorrelated case in [4, p. 4040-4041] with the exception of using, in our correlated case, the above calculated values for $\epsilon$ (57) and $a$ (54). ■
Example 5: Let one set of samples be obtained based on the extended measurement matrix $\Phi_e$ with $K_a = K$, $K_e = 2K$, and $M = 8$, and let another set of samples be obtained using the $K \times N$ measurement matrix with all i.i.d. (Bernoulli) elements. The error bounds corresponding to the case of $K$ uncorrelated samples of [4] and our case of $K_e$ correlated samples are (10) and (61), respectively. The comparison between these two error bounds boils down in this example to comparing $2C_1C_2$ and $C_{1e}C_{2e}$. Assuming the same $\epsilon$ as in (57) for both methods, $C_{2e} = C_2$ holds. Fig. 3 compares $C_{1e}$ and $2C_1$ versus the signal-to-noise ratio (SNR) $B^2/\sigma^2$. Since $C_{1e} < 2C_1$ for all values of SNR, the quality of the signal recovery, i.e., the corresponding MSE, for the case of the $2K \times N$ extended measurement matrix is expected to be better than the quality of the signal recovery for the case of the $K \times N$ measurement matrix of all i.i.d. entries.

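Fig. 3. $C_{1e}$ and $2C_1$ versus SNR.

The above results can be easily generalized for the case when $K_a > K$. Indeed, we only need to recalculate $\mathrm{var}\{\sum_{j=1}^{K_e} U_j\}$ for $K_e > 2K$. The only difference with the previous case of $K_a \le K$ is the increased number of pairs of dependent rows in the extended measurement matrix $\Phi_e$, which has a larger size now. The latter affects only the second term in (46). In particular, every row in $(\Phi_e)_2$ depends on $M$ rows of the original measurement matrix $\Phi$. Moreover, the sum of the terms $E\{U_iU_j\} - (\|g\|^2/N)^2$ over all these $M$ rows is bounded as in (48). Then considering all $KM$ pairs of dependent rows from $\Phi$ and $(\Phi_e)_2$, we have

$$2 \sum_{\phi_i, \phi_j \text{ dependent}} \left( E\{A^4\} - \left( \frac{\|g_A\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_A\|^2}{N} \right) \le 4K \left( \frac{\|g\|^2}{N} \right)^2 + \frac{8\sigma^2 K}{M} \frac{\|g\|^2}{N}. \tag{63}$$

Similarly, every row of $(\Phi_e)_3$ depends on $M$ rows of $\Phi$ and $M$ rows of $(\Phi_e)_2$. Considering all these $2KM$ pairs of dependent rows, we have

$$2 \sum_{\phi_i, \phi_j \text{ dependent}} \left( E\{A^4\} - \left( \frac{\|g_A\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_A\|^2}{N} \right) \le 4(2K) \left( \frac{\|g\|^2}{N} \right)^2 + \frac{8\sigma^2 (2K)}{M} \frac{\|g\|^2}{N}. \tag{64}$$

Finally, the number of rows in the last matrix $(\Phi_e)_{n_p}$ is $K_{n_p}$ (see (28) and (29)). Every row of $(\Phi_e)_{n_p}$ depends on $M$ rows of each of the previous $n_p - 1$ matrices $(\Phi_e)_i$, $i = 1, \ldots, n_p - 1$. Considering all $(n_p - 1)K_{n_p}M$ pairs of dependent rows, we have

$$2 \sum_{\phi_i, \phi_j \text{ dependent}} \left( E\{A^4\} - \left( \frac{\|g_A\|^2}{N} \right)^2 + \frac{4\sigma^2}{M} \frac{\|g_A\|^2}{N} \right) \le 4(n_p - 1)K_{n_p} \left( \frac{\|g\|^2}{N} \right)^2 + \frac{8\sigma^2 (n_p - 1)K_{n_p}}{M} \frac{\|g\|^2}{N}. \tag{65}$$

Based on (37) and (63)-(65) we can find the following bound

$$\mathrm{var}\left\{ \sum_{j=1}^{K_e} U_j \right\} \le K_e \left( 8B^2 \left( 1 + \frac{D}{K_e} \right) + 4\sigma^2 \left( 1 + \frac{D}{M K_e} \right) \right) r(\hat{f}, f) \tag{66}$$

where $D = 2K \sum_{i=1}^{n_p - 2} i + 2(n_p - 1)K_{n_p}$. Note that in the case that $K_e = n_p K$, we have $D/K_e = n_p - 1$.

Therefore, it can be shown for the general extended matrix (23) that the inequality (56) holds with the following values of $a$ and $\epsilon$:

$$a = \frac{\left( 8B^2 \left( 1 + \frac{D}{K_e} \right) + 4\sigma^2 \left( 1 + \frac{D}{M K_e} \right) \right) \epsilon}{2(1 - \zeta)} \tag{67}$$

$$\epsilon < \frac{1}{\left( 4\left( 1 + \frac{D}{K_e} \right) + 16e \right) B^2 + 8\sqrt{2} B\sigma + 2\sigma^2 \left( 1 + \frac{D}{M K_e} \right)}. \tag{68}$$

Moreover, theorems similar to Theorems 4 and 5 follow straightforwardly with the corrections to $a$ and $\epsilon$ which are given now by (67) and (68), respectively.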

We finally make some remarks on non-RIP conditions for $\ell_1$-norm-based recovery. Since the extended measurement matrix of the proposed segmented compressed sampling method satisfies the RIP, the results of [23] on recoverability and stability of the $\ell_1$-norm minimization straightforwardly apply. A different non-RIP-based approach for studying the recoverability and stability of the $\ell_1$-norm minimization, which uses some properties of the null space of the measurement matrix, is used in [27]. Then the non-RIP sufficient condition for recoverability of a sparse signal from its noiseless compressed samples using the $\ell_1$-norm minimization is [23]

$$\sqrt{S} < \min \left\{ \frac{0.5 \|v\|_{\ell_1}}{\|v\|_{\ell_2}} : v \in \mathcal{N}(\Phi) \setminus \{0\} \right\} \tag{69}$$

where $\mathcal{N}(\Phi)$ denotes the null space of the measurement matrix $\Phi$.

Let us show that the condition (69) is also satisfied for the extended measurement matrix $\Phi_e$. Let $d$ be any vector in the null space of $\Phi_e$, i.e., $d \in \mathcal{N}(\Phi_e)$. Therefore, $[\Phi_e]_i d = 0$, $i = 1, \ldots, K_e$, where $[\Phi_e]_i$ is the $i$-th $1 \times N$ row-vector of $\Phi_e$. Since the first $K$ rows of $\Phi_e$ are exactly the same as the $K$ rows of $\Phi$, we have $[\Phi]_i d = 0$, $i = 1, \ldots, K$. Therefore, $d \in \mathcal{N}(\Phi)$, and we can conclude that $\mathcal{N}(\Phi_e) \subseteq \mathcal{N}(\Phi)$. Due to this property, we have $\min\{0.5\|v\|_{\ell_1}/\|v\|_{\ell_2} : v \in \mathcal{N}(\Phi) \setminus \{0\}\} \le \min\{0.5\|v\|_{\ell_1}/\|v\|_{\ell_2} : v \in \mathcal{N}(\Phi_e) \setminus \{0\}\}$. Therefore, if the original measurement matrix $\Phi$ satisfies (69), so does the extended measurement matrix $\Phi_e$, and the signal is recoverable from the samples taken by $\Phi_e$.

Moreover, the necessary and sufficient condition for all signals with $\|x\|_{\ell_0} \le S$ to be recoverable from noiseless compressed samples using the $\ell_1$-norm minimization is that [27]

$$\|v\|_{\ell_1} > 2\|v_T\|_{\ell_1}, \quad \forall v \in \mathcal{N}(\Phi) \setminus \{0\} \tag{70}$$

where $T$ is the set of indexes corresponding to the nonzero coefficients of $x$. It is easy to see that since $\mathcal{N}(\Phi_e) \subseteq \mathcal{N}(\Phi)$, the condition (70) also holds for the extended measurement matrix if the original measurement matrix satisfies it.
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VI. Simulation Results 

Throughout our simulations we use a sparse signal of dimension 128 with only 3 nonzero entries, which are set to $\pm 1$ with equal probabilities. Since the signal is sparse in the time domain, $\Psi = I$. The collected samples are assumed to be noisy, i.e., the noisy sampling model is applied. In all our simulation examples, three different measurement matrices (sampling schemes) are used: (i) the $K \times N$ measurement matrix $\Phi$ with i.i.d. entries, referred to as the original measurement matrix; (ii) the extended $K_e \times N$ measurement matrix $\Phi_e$ obtained using the proposed segmented compressed sampling method, referred to as the extended measurement matrix; and (iii) the $K_e \times N$ measurement matrix with all i.i.d. entries, referred to as the enlarged measurement matrix. This last measurement matrix corresponds to the sampling scheme with $K_e$ independent BMIs in the AIC in Fig. 1. The number of segments $M$ in the proposed segmented compressed sampling method is set to 8. To make sure that the measurement noise for the additional samples obtained based on the extended measurement matrix is correlated with the measurement noise of the original samples, the $K \times M$ matrix of noisy sub-samples with the noise variance $\sigma^2/M$ is first generated. Then the permutations are applied to this matrix, and the sub-samples along each row of the original and permuted matrices are added together to build the noisy samples.
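
The construction of correlated noisy samples described above can be sketched as follows. This is only an illustration: the specific permutation used here (cyclically shifting column $m$ of the sub-sample matrix by $i \cdot m$ rows for the $i$-th additional set, with columns indexed from 0) is an assumed choice, and the function names are hypothetical.

```python
import random


def noisy_subsamples(clean, sigma, rng):
    """Add zero mean Gaussian noise of variance sigma^2 / M to each sub-sample,
    so that the sum over the M segments of a row carries noise variance sigma^2."""
    M = len(clean[0])
    return [[x + rng.gauss(0.0, sigma / M ** 0.5) for x in row] for row in clean]


def extended_samples(sub, n_sets):
    """Build the K original samples followed by n_sets * K additional samples.

    sub    : K x M matrix (list of lists) of sub-samples
    n_sets : number of additional sample sets
    ASSUMPTION: in set i, column m is cyclically shifted by i * m rows.
    """
    K, M = len(sub), len(sub[0])
    samples = [sum(row) for row in sub]  # original samples: plain row sums
    for i in range(1, n_sets + 1):
        for k in range(K):
            samples.append(sum(sub[(k + i * m) % K][m] for m in range(M)))
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [[float(k + m) for m in range(8)] for k in range(16)]
    y = extended_samples(noisy_subsamples(clean, sigma=0.1, rng=rng), n_sets=1)
    print(len(y))
```

Because every additional sample reuses sub-samples that already contributed to original samples, its noise is automatically correlated with the noise of those samples; no separate correlated noise generator is needed.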

The recovery performance for the three aforementioned sampling schemes is measured using the MSE between the recovered and original signals. In all examples, MSE values are computed based on 5000 independent simulation runs for all sampling schemes tested. The SNR is defined as $\|\Phi f\|_{\ell_2}^2 / (K'\sigma^2)$. Approximating $\|\Phi f\|_{\ell_2}^2$ by $(K'/N)\|f\|_{\ell_2}^2$, which is valid because of (2), the corresponding noise variance $\sigma^2$ can be calculated if the SNR is given, and vice versa. Here $K' = K$ for the sampling scheme based on the original measurement matrix, while $K' = K_e$ in the other two schemes. For example, the approximate SNR in dB can be calculated as $10 \log_{10}(S/(N\sigma^2))$.
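
As a small worked example of the last formula, the conversion between the approximate SNR in dB and the noise variance can be written as below. The function names are hypothetical, and the formula relies on $\|f\|^2 = S$, which holds here because the nonzero entries are $\pm 1$.

```python
import math


def noise_variance(snr_db, S, N):
    """Noise variance sigma^2 for a given approximate SNR in dB,
    obtained by inverting SNR_dB = 10 log10(S / (N sigma^2))."""
    return S / (N * 10 ** (snr_db / 10))


def approx_snr_db(sigma2, S, N):
    """Approximate SNR in dB for a given noise variance sigma^2."""
    return 10 * math.log10(S / (N * sigma2))


if __name__ == "__main__":
    for snr in (5, 15, 25):
        print(snr, noise_variance(snr, S=3, N=128))
```

For the signal used in the simulations (S = 3, N = 128), an SNR of 0 dB corresponds to $\sigma^2 = 3/128$, and every additional 10 dB reduces the noise variance by a factor of 10.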

Recovery based on the $\ell_1$-norm minimization algorithm: In our first simulation example, the $\ell_1$-norm minimization algorithm (5) is used to recover a signal sampled using the three aforementioned sampling schemes. Since $\Psi = I$, the sparsity basis drops out of (5). The number of BMIs in the sampling device is taken to be $K = 16$, while $\gamma$, which is the bound on the square root of the noise energy, is set to $\sqrt{K'}\sigma$. The entries of the original and enlarged measurement matrices are generated as i.i.d. Gaussian distributed random variables with zero mean and variance $1/N$.

Fig. 4 shows the MSEs corresponding to all three aforementioned measurement matrices versus the ratio of the number of additional samples to the number of original samples $K_a/K$. The results are shown for three different SNR values of 5, 15 and 25 dB. It can be seen from the figure that better recovery quality is achieved by using the extended measurement matrix as compared to the original measurement matrix. The improvements are more significant for high SNRs since the recovery error is proportional to the noise power [23]. As expected, the recovery performance in the case of the extended measurement matrix is not as good as in the case of the enlarged measurement matrix. This difference, however, is small as compared to the performance improvement over the original measurement matrix. Note also that in the case of the enlarged measurement matrix, the AIC in Fig. 1 consists of $K_e$ BMIs, while only $K$ BMIs are required in the case of the extended measurement matrix. Thus, the segmented AIC requires $K_e - K$ fewer BMIs. For example, the number of such BMIs halves if $K_a/K = 1$. Additionally, it can be seen that the rate of MSE improvement decreases as the number of collected samples increases. The latter can be observed for both the extended and enlarged measurement matrices and for all three values of SNR.

Fig. 4. Recovery based on the $\ell_1$-norm minimization algorithm: MSEs versus $K_a/K$.

Recovery based on the empirical risk minimization method: In our second simulation example, the empirical risk minimization method is used to recover a signal sampled using the three aforementioned sampling schemes tested with $K = 24$. The minimization problem (7) is solved to obtain a candidate reconstruction $\hat{f}_{K'}$ of the original sparse signal $f$. Considering $\hat{f}_{K'} = \Psi \hat{x}_{K'}$, the problem (7) can be rewritten in terms of $\hat{x}_{K'}$ as

$$\hat{x}_{K'} = \arg \min_{x \in \mathcal{X}} \left\{ \hat{r}(\Psi x) + \frac{c(x) \log 2}{\epsilon K'} \right\} = \arg \min_{x \in \mathcal{X}} \left\{ \|y - \Phi \Psi x\|_{\ell_2}^2 + \frac{2 \log 2 \log N}{\epsilon} \|x\|_{\ell_0} \right\} \tag{71}$$

and solved using an iterative bound optimization procedure. This procedure uses the threshold $\sqrt{2 \log 2 \log N / (\lambda \epsilon)}$, where $\lambda$ is the largest eigenvalue of the matrix $\Phi^T \Phi$. In our simulations, this threshold is set to 0.035 for the case of the extended measurement matrix and 0.05 for the cases of the original and enlarged measurement matrices. These threshold values are optimized following the recommended procedure. The stopping criterion for the iterative bound optimization procedure is $\|x^{(i+1)} - x^{(i)}\|_{\ell_\infty} \le \theta$, where $\|\cdot\|_{\ell_\infty}$ is the $\ell_\infty$ norm and $x^{(i)}$ denotes the value of $x$ obtained in the $i$-th iteration. The value $\theta = 0.001$ is selected.

Fig. 5. Recovery based on the empirical risk minimization method: MSEs versus $K_a/K$. (a) Measurement matrix with Gaussian distributed entries. (b) Measurement matrix with Bernoulli distributed entries.

Fig. 5 shows the MSEs obtained based on the empirical risk minimization method for all three measurement matrices versus the ratio $K_a/K$. The results are shown for three different SNR values of 5, 15 and 25 dB. Two cases are considered: (a) the entries of the original and the enlarged measurement matrices are generated as i.i.d. zero mean Gaussian distributed random variables with variance $1/N$, and (b) the entries of the original and enlarged measurement matrices are generated as i.i.d. zero mean Bernoulli distributed random variables with the same variance as in case (a). The same conclusions as in the first example can be drawn in this example. Moreover, the results for cases (a) and (b) are also similar. Therefore, the proposed segmented AIC indeed leads to significantly improved signal recovery performance without increasing the number of BMIs.

VII. Conclusion 

A new segmented compressed sampling method for AIC has been proposed. According to this method, the signal is segmented into $M$ segments and passed through $K$ BMIs of the AIC to generate a $K \times M$ matrix of sub-samples. Then, a number of correlated samples larger than the number of BMIs is constructed by adding up different subsets of sub-samples selected in a specific manner. Due to the inherent structure of the method, the complexity of the sampling device is almost unchanged, while the signal recovery performance is shown to be significantly improved. The complexity increase is only due to the $M$ times higher sampling rate and the necessity to solve a larger optimization problem at the recovery stage, while the number of BMIs remains the same at the sampling stage. The validity and superiority of the proposed segmented AIC method over the conventional AIC are justified through theoretical analysis of the RIP and of the quality of signal recovery. Simulation results also verify the effectiveness and superiority of the proposed segmented AIC method and validate our theoretical studies.

Appendix A: Proof of Theorem 1 

The total number of possible permutations of $z$ is $K!$. Let $\mathcal{A}$ be the set of permutations $\pi_s$, $s = 1, \ldots, |\mathcal{A}|$ that satisfy the following condition

$$\pi_s(k) \neq \pi_t(k), \quad s \neq t, \quad \forall s, t \in \{1, \ldots, |\mathcal{A}|\}, \quad \forall k \in \{1, \ldots, K\}. \tag{72}$$

It is easy to see that the number of distinct permutations satisfying the condition (72) is $K$, so $|\mathcal{A}| = K$. It is also straightforward to see that the choice of such $K$ distinct permutations is not unique. As a specific choice, let the elements of $\mathcal{A}$, i.e., the permutations $\pi_s$, $s = 1, \ldots, K$, be

$$\pi_s(k) = ((s + k - 2) \bmod K) + 1, \quad s, k = 1, \ldots, K \tag{73}$$

with $\pi_1$ being the identity permutation, i.e., the permutation that does not change $z$.
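
The specific choice (73) is easy to generate and check against condition (72). The following sketch uses hypothetical function names and 1-based permutation values to match the text.

```python
def circulant_permutations(K):
    """Return the K permutations of (73): perms[s-1][k-1] = ((s + k - 2) mod K) + 1."""
    return [[((s + k - 2) % K) + 1 for k in range(1, K + 1)] for s in range(1, K + 1)]


def satisfies_condition_72(perms):
    """Check that pi_s(k) != pi_t(k) for all s != t and every position k."""
    K = len(perms[0])
    return all(len({p[k] for p in perms}) == len(perms) for k in range(K))


if __name__ == "__main__":
    perms = circulant_permutations(5)
    for p in perms:
        print(p)
    print(satisfies_condition_72(perms))
```

Each permutation is a cyclic shift of the previous one, so at any fixed position the $K$ permutations take $K$ different values, which is exactly what (72) requires.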

Consider now the matrix $Z$ which consists of $M$ columns $z$. The $i$-th set of column permutations of matrix $Z$ is $\mathcal{P}^{(i)} = \{\pi_1^{(i)}, \ldots, \pi_M^{(i)}\}$ and the corresponding permuted matrix is $Z^{\mathcal{P}^{(i)}}$. Let $\{\pi_1^{(i)}, \ldots, \pi_M^{(i)}\}$ be any combination of the $K$ permutations in (73). Then there are $K^M$ possible choices for $\mathcal{P}^{(i)}$. However, not all of these possible choices are permissible by the conditions of the theorem.

Indeed, let the set $\mathcal{P}^{(1)}$ be a combination of permutations from $\mathcal{A}$ that satisfies (18). There are $I - 1$ other sets $\mathcal{P}^{(i)}$, $i = 2, \ldots, I$ which satisfy both (18) and (19). Gathering all such sets in one set, we obtain the set $\mathcal{P} = \{\mathcal{P}^{(1)}, \ldots, \mathcal{P}^{(I)}\}$. Now let $\mathcal{P}^{(I+1)} = \{\pi_1^{(I+1)}, \ldots, \pi_M^{(I+1)}\}$ be one more set of permutations where there exists $\pi_m^{(I+1)}$, $m \in \{1, \ldots, M\}$, such that $\pi_m^{(I+1)} \notin \mathcal{A}$. An arbitrary $k$-th row of $Z^{\mathcal{P}^{(I+1)}}$ is $([Z^{\mathcal{P}^{(I+1)}}]_{k,1}, \ldots, [Z^{\mathcal{P}^{(I+1)}}]_{k,M})$ where $[Z^{\mathcal{P}^{(I+1)}}]_{k,1}, \ldots, [Z^{\mathcal{P}^{(I+1)}}]_{k,M} \in \{1, \ldots, K\}$. This exact same row can be found as the first row of one of the permuted matrices $Z^{\mathcal{P}^{(i)}}$, $\mathcal{P}^{(i)} \in \mathcal{P}$. Specifically, this is the permuted matrix $Z^{\mathcal{P}^{(i)}}$ that is obtained by applying the permutations $\mathcal{P}^{(i)} = \{\pi_{[Z^{\mathcal{P}^{(I+1)}}]_{k,1}}, \ldots, \pi_{[Z^{\mathcal{P}^{(I+1)}}]_{k,M}}\}$. The set of permutations $\mathcal{P}^{(i)}$ either has to belong to $\mathcal{P}$ or has been crossed out from $\mathcal{P}$ because of a conflict with some other element $\mathcal{P}^{(l)} \in \mathcal{P}$, $l \neq i$. In both cases, $\mathcal{P}^{(I+1)}$ cannot be added to $\mathcal{P}$ because it would contradict the conditions (18) and (19).

Therefore, the set $\mathcal{P}$ can be built using only the permutations from the set $\mathcal{A}$, i.e., the $K$ permutations in (73). Rearranging the rows of $Z^{\mathcal{P}^{(i)}}$ in a certain way, one can force the elements in the first column of $Z^{\mathcal{P}^{(i)}}$ to appear in the original increasing order, i.e., enforce the first column to be equivalent to the vector of indexes $z$. It can be done by applying to each permutation in the set $\mathcal{P}^{(i)}$ the inverse permutation $(\pi_1^{(i)})^{-1}$, which itself is one of the permutations in (73). Therefore, the set $\mathcal{P}^{(i)} = \{\pi_1^{(i)}, \ldots, \pi_M^{(i)}\}$ can be replaced by the equivalent set $\{(\pi_1^{(i)})^{-1}\pi_1^{(i)}, \ldots, (\pi_1^{(i)})^{-1}\pi_M^{(i)}\}$ where $(\pi_1^{(i)})^{-1}\pi_1^{(i)} = \pi_1$ is the identity permutation and $(\pi_1^{(i)})^{-1}\pi_m^{(i)} \in \mathcal{A}$. Hence, we can consider only the permutations of the form $\mathcal{P}^{(i)} = \{\pi_1, \pi_2^{(i)}, \ldots, \pi_M^{(i)}\}$. Since the condition (18) requires that $\pi_2^{(i)}$ should be different from $\pi_1$, the only available options for the permutations on the second column of $Z$ are the $K - 1$ permutations $\pi_2, \ldots, \pi_K$ in (73). Therefore, $I$ at most equals $K - 1$. Note that $I$ can be smaller than $K - 1$ if for some $i \in \{1, \ldots, K - 1\}$, $K/\gcd(i, K) < M$ (see also Example 1 after Theorem 1). Thus, in general $I \le K - 1$.

Appendix B: Proof of Lemma 1 

Let all the rows of $(\Phi_e)_T$ be partitioned into two sets of sizes (cardinalities) as close as possible to each other, where all elements in each set are guaranteed to be statistically independent. In particular, note that the elements of the new $K_a$ rows of $\Phi_e$ are chosen either from the first $K_a + M - 1$ rows of $\Phi$ if $K_a + M - 1 < K$ or from the whole matrix $\Phi$. Therefore, if $K_a + M - 1 < K$, the last $K - K_a - M + 1$ rows of $\Phi$ play no role whatsoever in the process of extending the measurement matrix and they are independent of the rows of $\Phi_1$ in (24). These rows are called unused rows. Thus, one can freely add any number of such unused rows to the set of rows in $\Phi_1$ without disrupting its status of being formed by independent Gaussian variables. Since $\min\{K, K_a + M - 1\} \le \lceil (K + K_a)/2 \rceil$, there exist at least $\lfloor (K + K_a)/2 \rfloor - K_a$ unused rows which can be added to the set of rows in $\Phi_1$. Such a process describes how the rows of $(\Phi_e)_T$ are split into the desired sets $(\Phi_e)_T^1$ and $(\Phi_e)_T^2$ of statistically independent elements. As a result, the first matrix $(\Phi_e)_T^1$ includes the first $\lceil (K + K_a)/2 \rceil$ rows of $(\Phi_e)_T$, while the rest of the rows are included in $(\Phi_e)_T^2$.

Since the elements of the matrices $(\Phi_e)_T^1$ and $(\Phi_e)_T^2$ are i.i.d. Gaussian, they will satisfy (2) with probabilities equal to or larger than $1 - 2(12/\delta_S)^S e^{-C_0 \lceil K_e/2 \rceil}$ and $1 - 2(12/\delta_S)^S e^{-C_0 \lfloor K_e/2 \rfloor}$, respectively. Therefore, both matrices $(\Phi_e)_T^1$ and $(\Phi_e)_T^2$ satisfy (2) with the common probability bound

$$\Pr\{(\Phi_e)_T^i \text{ satisfies (2)}\} \ge 1 - 2(12/\delta_S)^S e^{-C_0 \lfloor K_e/2 \rfloor}, \quad i = 1, 2. \tag{74}$$

Let $K_1' = \lceil K_e/2 \rceil$ and $K_2' = \lfloor K_e/2 \rfloor$. Consider the event when both $(\Phi_e)_T^1$ and $(\Phi_e)_T^2$ satisfy (2). Then the following inequality holds for any vector $c \in \mathbb{R}^S$:

$$\sum_{i=1}^{2} \frac{K_i'}{N}(1 - \delta_S)\|c\|_{\ell_2}^2 \le \sum_{i=1}^{2} \|(\Phi_e)_T^i c\|_{\ell_2}^2 \le \sum_{i=1}^{2} \frac{K_i'}{N}(1 + \delta_S)\|c\|_{\ell_2}^2 \tag{75}$$

or, equivalently,

$$\frac{K_e}{N}(1 - \delta_S)\|c\|_{\ell_2}^2 \le \|(\Phi_e)_T c\|_{\ell_2}^2 \le \frac{K_e}{N}(1 + \delta_S)\|c\|_{\ell_2}^2. \tag{76}$$

Therefore, if both matrices $(\Phi_e)_T^1$ and $(\Phi_e)_T^2$ satisfy (2), then the matrix $(\Phi_e)_T$ also satisfies (2). Moreover, the probability that $(\Phi_e)_T$ does not satisfy (2) can be bounded as

$$\begin{aligned}
\Pr\{(\Phi_e)_T \text{ does not satisfy (2)}\} &\le \Pr\{(\Phi_e)_T^1 \text{ or } (\Phi_e)_T^2 \text{ does not satisfy (2)}\} \\
&\overset{(a)}{\le} \sum_{i=1}^{2} \Pr\{(\Phi_e)_T^i \text{ does not satisfy (2)}\} \\
&\overset{(b)}{\le} 4(12/\delta_S)^S e^{-C_0 \lfloor K_e/2 \rfloor}
\end{aligned} \tag{77}$$

where the inequality (a) follows from the union bound and the inequality (b) follows from (74). Thus, the inequality (26) holds.

Appendix C: Proof of Theorem 2 

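According to (26), the matrix $(\Phi_e)_T$ does not satisfy (2) with probability less than or equal to $4(12/\delta_S)^S e^{-C_0 \lfloor K_e/2 \rfloor}$ for any subset $T \subset \{1, \ldots, N\}$ of cardinality $S$. Since there are $\binom{N}{S} \le (Ne/S)^S$ different subsets $T$ of cardinality $S$, $\Phi_e$ does not satisfy the RIP with probability

$$\begin{aligned}
\Pr\{\Phi_e \text{ does not satisfy RIP}\} &\le 4\binom{N}{S}(12/\delta_S)^S e^{-C_0 \lfloor K_e/2 \rfloor} \\
&\le 4(Ne/S)^S (12/\delta_S)^S e^{-C_0 \lfloor K_e/2 \rfloor} = 4e^{-\left( C_0 \lfloor K_e/2 \rfloor - S[\log(Ne/S) + \log(12/\delta_S)] \right)} \\
&\le 4e^{-\left\{ C_0 \lfloor K_e/2 \rfloor - C_3[\log(Ne/S) + \log(12/\delta_S)] \lfloor K_e/2 \rfloor / \log(N/S) \right\}} \\
&= 4e^{-\left\{ C_0 - C_3[1 + (1 + \log(12/\delta_S))/\log(N/S)] \right\} \lfloor K_e/2 \rfloor}.
\end{aligned} \tag{78}$$

Setting $C_4 = C_0 - C_3[1 + (1 + \log(12/\delta_S))/\log(N/S)]$ and choosing $C_3$ small enough to guarantee that $C_4$ is positive, we obtain (27).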
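
The counting bound $\binom{N}{S} \le (Ne/S)^S$ used in the first step above is a standard inequality and can be checked numerically, for instance for the signal dimension used in the simulations. The helper name below is hypothetical.

```python
import math


def subset_count_bound_holds(N, S):
    """Check that the number of size-S subsets of {1, ..., N} is at most (N e / S)^S."""
    return math.comb(N, S) <= (N * math.e / S) ** S


if __name__ == "__main__":
    N = 128
    print(all(subset_count_bound_holds(N, S) for S in range(1, N + 1)))
    print(math.comb(N, 3), (N * math.e / 3) ** 3)
```

For N = 128 and S = 3 the exact count is 341376, while the bound is roughly $1.6 \times 10^6$, so the bound is loose by a moderate factor but has the exponential form needed to combine it with the other terms in (78).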
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Appendix D: Proof of Lemma 2 

The method of the proof is the same as the one used to prove Lemma 1 and is based on splitting the rows of $\Phi_e$ into a number of sets with independent entries. Here, the splitting is carried out as shown in (29).

Let $(\Phi_e)_T^i$, $i = 1, \ldots, n_p - 1$ be the matrix containing the $((i - 1)K + 1)$-th to the $iK$-th rows of $(\Phi_e)_T$. The last $K_{n_p} = K_e - (n_p - 1)K$ rows of $(\Phi_e)_T$ form the matrix $(\Phi_e)_T^{n_p}$. Since the matrices $(\Phi_e)_T^i$, $i = 1, \ldots, n_p - 1$ consist of independent entries, they satisfy (2) each with probability of at least $1 - 2(12/\delta_S)^S e^{-C_0 K}$. For the same reason, the matrix $(\Phi_e)_T^{n_p}$ satisfies (2) with probability greater than or equal to $1 - 2(12/\delta_S)^S e^{-C_0 K_{n_p}}$. In the event that all the matrices $(\Phi_e)_T^i$, $i = 1, \ldots, n_p$ satisfy (2) simultaneously, for $c \in \mathbb{R}^S$ we have

$$\sum_{i=1}^{n_p} \frac{K_i}{N}(1 - \delta_S)\|c\|_{\ell_2}^2 \le \sum_{i=1}^{n_p} \|(\Phi_e)_T^i c\|_{\ell_2}^2 \le \sum_{i=1}^{n_p} \frac{K_i}{N}(1 + \delta_S)\|c\|_{\ell_2}^2$$

or, equivalently,

$$\frac{K_e}{N}(1 - \delta_S)\|c\|_{\ell_2}^2 \le \|(\Phi_e)_T c\|_{\ell_2}^2 \le \frac{K_e}{N}(1 + \delta_S)\|c\|_{\ell_2}^2. \tag{79}$$

Therefore, using the union bound and (79), we can conclude that

$$\begin{aligned}
\Pr\{(\Phi_e)_T \text{ does not satisfy (2)}\} &\le \sum_{i=1}^{n_p} \Pr\{(\Phi_e)_T^i \text{ does not satisfy (2)}\} \\
&\le 2(n_p - 1)(12/\delta_S)^S e^{-C_0 K} + 2(12/\delta_S)^S e^{-C_0 K_{n_p}}
\end{aligned} \tag{80}$$

which proves the lemma.

Appendix E: Proof of Theorem 3 

According to Lemma 2, for any subset $T \subset \{1, \ldots, N\}$ of cardinality $S$, the probability that $(\Phi_e)_T$ does not satisfy (2) is less than or equal to $2(n_p - 1)(12/\delta_S)^S e^{-C_0 K} + 2(12/\delta_S)^S e^{-C_0 K_{n_p}}$. Using the fact that there are $\binom{N}{S} \le (Ne/S)^S$ different subsets $T$, the probability that the extended measurement matrix $\Phi_e$ does not satisfy the RIP can be bounded as

$$\begin{aligned}
\Pr\{\Phi_e \text{ does not satisfy the RIP}\} &\le 2(n_p - 1)\binom{N}{S}(12/\delta_S)^S e^{-C_0 K} + 2\binom{N}{S}(12/\delta_S)^S e^{-C_0 K_{n_p}} \\
&\le 2(n_p - 1)(Ne/S)^S (12/\delta_S)^S e^{-C_0 K} + 2(Ne/S)^S (12/\delta_S)^S e^{-C_0 K_{n_p}} \\
&= 2(n_p - 1)e^{-\left( C_0 K - S[\log(Ne/S) + \log(12/\delta_S)] \right)} + 2e^{-\left( C_0 K_{n_p} - S[\log(Ne/S) + \log(12/\delta_S)] \right)} \\
&\le 2(n_p - 1)e^{-\left\{ C_0 K - C_3[\log(Ne/S) + \log(12/\delta_S)] K_{n_p}/\log(N/S) \right\}} \\
&\quad + 2e^{-\left\{ C_0 K_{n_p} - C_3[\log(Ne/S) + \log(12/\delta_S)] K_{n_p}/\log(N/S) \right\}} \\
&= 2(n_p - 1)e^{-\left\{ C_0 - (C_3 K_{n_p}/K)[1 + (1 + \log(12/\delta_S))/\log(N/S)] \right\} K} \\
&\quad + 2e^{-\left\{ C_0 - C_3[1 + (1 + \log(12/\delta_S))/\log(N/S)] \right\} K_{n_p}}.
\end{aligned} \tag{81}$$

Denoting the constant terms as $C_4 = C_0 - C_3[1 + (1 + \log(12/\delta_S))/\log(N/S)]$ and $C_4' = C_0 - (C_3 K_{n_p}/K) \times [1 + (1 + \log(12/\delta_S))/\log(N/S)]$, and choosing $C_3$ small enough in order to guarantee that $C_4$ and $C_4'$ are positive, we obtain (31).
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